Guest post: Large language models can make legal judgments fairer – if we use them properly

By Dawid Kotur, CEO and founder, and Nick Long, CTO, at Curvestone

Whether it’s seen as a positive development or not, generative AI, especially large language models (LLMs) like ChatGPT, is set to reshape the legal landscape dramatically. Experiment with ChatGPT and it soon becomes evident that its abilities stretch beyond tasks like summarising lengthy texts, formatting and authoring documents, or extracting specifics from contracts. Using AI tools in this way feels like having an army of clerical assistants at your disposal. Yet it’s not just these capabilities that are transformative. As the industry learns more, the potential uses of LLMs could extend much further into the realms of human judgment.

The “Chain of thought” promise 

Using AI to emulate human judgment isn’t new, but previous technological limitations stymied its adoption. A common test of AI is “Legal Judgment Prediction” (LJP). Here, an AI is shown the facts of a case and is tested on its ability to predict the judgment of a human expert. Until recently, Deep Learning was the state of the art – but while powerful, it had two flaws: it required a lot of data from similar cases annotated by humans, meaning a great deal of time and money spent on manual work, and it couldn’t always explain its reasoning—much like Douglas Adams’ fictional computer that declared the meaning of life as “42” but couldn’t explain why.

Experiments with ChatGPT have shown that it can perform LJP by applying a legal syllogism in much the same way a human would, and that it achieves the most accurate results when using a technique called “chain of thought”. In that approach, the reasoning is performed step by step, and each step is explained to the user. This method aligns with practices familiar to many legal professionals, making LLMs like ChatGPT more suitable for real-world tasks than any previous generation of tools.
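
For illustration, here is a minimal sketch of how a chain-of-thought prompt for legal judgment prediction might be assembled in Python. The prompt wording, the example facts and the `call_llm` helper are assumptions made for this sketch, not Curvestone’s implementation or any particular provider’s API; a real deployment would connect the marked stub to an LLM service.

```python
# Minimal sketch of a chain-of-thought prompt for legal judgment prediction (LJP).
# The prompt wording and the call_llm() stub are illustrative assumptions,
# not a production implementation.

COT_TEMPLATE = """You are assisting with legal judgment prediction.
Case facts:
{facts}

Reason step by step, as a legal syllogism:
1. State the applicable legal rule (major premise).
2. Relate the facts of this case to that rule (minor premise).
3. Draw the conclusion and state the predicted judgment.
Show each step of your reasoning before giving the final answer."""


def build_cot_prompt(facts: str) -> str:
    """Fill the chain-of-thought template with the facts of a case."""
    return COT_TEMPLATE.format(facts=facts.strip())


def call_llm(prompt: str) -> str:
    """Placeholder for a call to an LLM provider's API.
    Left unimplemented because client libraries and signatures vary."""
    raise NotImplementedError("Connect this to your LLM provider of choice.")


if __name__ == "__main__":
    example_facts = (
        "The claimant was dismissed without notice after raising a written "
        "grievance about unpaid overtime."  # illustrative facts only
    )
    prompt = build_cot_prompt(example_facts)
    print(prompt)  # inspect the prompt; pass it to call_llm() in a real setup
```

The key point is simply that the model is asked to show each step of the syllogism, so a human reviewer can check the reasoning rather than just the verdict.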

LLM-based judgments will be particularly beneficial where a high volume of cases must be considered in a short time, where the cases are broadly consistent, and where there is relatively little novelty in the type of judgment applied – employment tribunal cases being a prime example. Adoption of technologies that support faster and more effective judgments cannot come too quickly: there is, for instance, a backlog of over 50,000 employment tribunal cases in the UK. People who are uncomfortable with AI judgment may feel happier about it if it means their case will be dealt with quickly rather than left in limbo for long periods.

Finding the appropriate use case

The ability of AI to simulate human judgment is impressive, but we must tread carefully. Over-relying on machines for pivotal legal decisions that impact human lives, without oversight, would be very dangerous. The most appropriate and promising uses for automated judgment are where it can support legal professionals to work more efficiently and where it can reduce inconsistency and bias.

Even people who are highly trained and experienced in making judgments have been shown to be inconsistent. Perhaps the most famous example is the 2011 study showing that judges were far more likely to grant parole if they had recently eaten than if they were hungry. This is an area where AI can step in, if it can help rectify human bias.

AI tools in the legal field can serve a dual purpose: first, they can provide preliminary guidance, laying a foundation for human professionals to review and refine. Second, they can be employed retrospectively to identify and correct any oversights or biases that might have slipped in unconsciously. The use of AI to support legal judgments is already building trust and gaining traction: in a July 2023 speech given by Sir Geoffrey Vos to the Bar Council of England and Wales, he said: “In […] disputes, such as commercial and compensation disputes, parties may come to have confidence in machine made decisions more quickly than many might expect.”

Watching out for biases and accessibility risks

While AI promises consistency and repeatability in its results, we need to remember that its outputs, like those of any tool, aren’t inherently free from bias.

A common misconception is that AI, dealing in “pure logic”, is inherently unbiased. However, if an AI is trained on biased data, it will inevitably produce biased results. Notably, some AI systems predicting reoffending risks have displayed racial biases. This shows that whenever an AI algorithm is developed to support legal judgment, it is critical that extensive bias testing is carried out on an ongoing basis. That said, while LLMs might not be the magic bullet against bias, their transparent reasoning offers more verifiability than previous opaque algorithms. For instance, you could fairly easily ask an LLM to provide the demographics behind the answers it is giving, and check whether certain sexes, races, religions and so on are over-represented.
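
As a rough illustration of that kind of ongoing check, the sketch below compares favourable-outcome rates across demographic groups and flags any group whose rate falls well below the best-performing group’s. The sample records, field names and the 0.8 threshold (a “four-fifths”-style rule of thumb) are assumptions made for the example, not a prescribed methodology.

```python
from collections import defaultdict

# Illustrative bias check: compare favourable-outcome rates across groups.
# The records, field names and 0.8 threshold are assumptions for this sketch.

def outcome_rates(records: list[dict]) -> dict[str, float]:
    """Rate of favourable outcomes per demographic group."""
    totals: dict[str, int] = defaultdict(int)
    favourable: dict[str, int] = defaultdict(int)
    for r in records:
        totals[r["group"]] += 1
        favourable[r["group"]] += 1 if r["favourable"] else 0
    return {g: favourable[g] / totals[g] for g in totals}


def flag_disparities(rates: dict[str, float], threshold: float = 0.8) -> list[str]:
    """Flag groups whose rate is below `threshold` times the best group's rate."""
    best = max(rates.values())
    return [g for g, rate in rates.items() if rate < threshold * best]


if __name__ == "__main__":
    sample = [  # hypothetical model outputs tagged with a demographic group
        {"group": "A", "favourable": True},
        {"group": "A", "favourable": True},
        {"group": "A", "favourable": False},
        {"group": "B", "favourable": True},
        {"group": "B", "favourable": False},
        {"group": "B", "favourable": False},
    ]
    rates = outcome_rates(sample)
    print(rates)                    # e.g. {'A': 0.67, 'B': 0.33}
    print(flag_disparities(rates))  # e.g. ['B']
```

A check like this says nothing about why a disparity exists, but it gives reviewers a concrete, repeatable signal to investigate as part of ongoing bias testing.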

What’s also remarkable about modern AI tools is their accessibility—anyone can use them for pretty much anything. This poses risks: given the significant time savings these tools offer to busy professionals, it’s inevitable that some individuals will opt to use them even where they are prohibited in the workplace. Instances of “Shadow IT”, where employees use unauthorised tools like ChatGPT to circumvent IT policy, are on the rise. In the courtroom specifically, this can lead to litigants presenting fictitious submissions based on answers provided by ChatGPT. It’s therefore crucial to ensure professionals across the legal profession understand the right use and potential pitfalls of AI tools.

The promise of using LLMs like ChatGPT in the legal realm is significant. But as the industry goes deeper into its applications, we have a collective responsibility to ensure user education, bias monitoring and responsible deployment overall.

Curvestone, founded in 2017, specialises in designing and building digital products that utilise AI. To learn more see http://curvestone.io/

We only publish guest posts on merit, never advertorials. To submit an idea for a guest post contact caroline@legaltechnology.com