Artificial Intelligence and International Criminal Law

Artificial Intelligence (‘AI’) tools, particularly large language models (‘LLMs’) such as OpenAI’s ChatGPT, Bing Chat, and Google Bard, pose clear risks when lawyers use them without understanding their limitations. One need look no further than recent news, where an attorney incurred the wrath of a New York federal judge for submitting a brief containing cases invented by ChatGPT.

But technology should not be dismissed just because it can be misused. These kinds of tools could revolutionise the legal profession, and a plethora of legal technology start-ups and scholarly contributions have sprouted up around them.

The potential of LLMs remains largely untapped in international criminal law (ICL). How these tools reshape ICL will hinge on their ability to navigate its unique features.

Particular Features of ICL

International criminal litigation is largely guided by complex documents. The monumental investigations needed to determine the occurrence of international crimes often result in a massive influx of documentary materials. LLMs are designed to comprehend large volumes of data, but their effectiveness can be hampered by poor-quality scans and documents in languages not commonly used in their training.

Moreover, ICL jurisprudence is diverse. Each ICL institution is its own jurisdiction, with its own repositories for storing its jurisprudence. Despite the existence of centralised compilations like the ICC Legal Tools Database, no publicly available AI tool has been specifically trained on them. By contrast, American lawyers can more easily integrate AI into their domestic practice through avenues like Westlaw Edge, Lexis+, and Casetext.

ICL also faces significant data security risks. Due to the novelty of LLMs, their security implications remain largely undefined, posing potential threats. Any data breaches in the prosecution of war crimes or crimes against humanity could have catastrophic consequences, such as the identification and targeting of victims, witnesses, and others at risk.

The Present and Future of LLMs in ICL

Nonetheless, LLMs offer immense benefits to ICL lawyers even now, not just in the distant future. These models can transform how lawyers function if appropriately understood and applied (see here for a general book-length survey).

Employing LLMs demands a paradigm shift from traditional search engine queries and Boolean search terms. Lawyers need to engage in dialogue with these models, and adeptly crafted prompts can significantly enhance the quality of responses. A forthcoming article by Daniel Schwartz and Jonathan H. Choi provides an excellent overview of how to prompt LLMs in legal contexts.

It must also be emphasised that I am an ICL-trained lawyer and not a legal tech expert. At the moment I am more an enthusiast than a skilled user of these tools, and it is very likely that my description of LLMs and their potential can be refined by more tech-savvy readers. Those interested in the technical aspects of how LLMs are developed should consult this May 2023 video presentation by Andrej Karpathy of OpenAI.

Applications of LLMs in ICL

  1. Jurisprudence/filing summaries

Perhaps the most immediate application of LLMs in international criminal law is their ability to summarise jurisprudence, or parts thereof. The quality of the summary depends on how the prompting questions are asked. Asking any prevailing LLM to ‘summarise the Ongwen Appeals Judgment on the crime of forced marriage’ yields a generic, unhelpful summary. But copying and pasting the relevant parts of the Appeals Chamber’s judgment on forced marriage and asking the LLM to ‘summarise the following passage of the Ongwen Appeals Judgment: “[insert paragraphs]”’ yields a much higher-quality summary. Putting quotation marks around the text also better allows for meaningful follow-ups, as further prompts can ask for answers to be given with direct quotes from the passage provided.

Obstacle for further growth? Character limits. At present, a ChatGPT-4 subscriber faces a prompt limit of 4,000 characters; for Bing Chat the limit is only 2,000 characters. The recent Stanisic and Simatovic Appeals Chamber judgment from the IRMCT runs 280 pages and over 700,000 characters…and ICL judgments can be substantially longer than that. GPT add-ons are being developed to circumvent these limits, and some PDF readers – ChatPdf, for example – can take whole PDFs and then allow users to ‘chat’ with them through ChatGPT.
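For the technically curious, the basic workaround these tools rely on can be sketched in a few lines of Python. This is a hypothetical illustration (the function name `chunk_text` and the limits are my own, not any vendor’s actual implementation): the judgment text is split into pieces that each fit under the model’s character limit, preferring paragraph boundaries, and each piece is then submitted to the model in turn.

```python
def chunk_text(text: str, limit: int = 4000) -> list[str]:
    """Split text into chunks of at most `limit` characters,
    breaking on paragraph boundaries where possible."""
    chunks: list[str] = []
    current = ""
    for para in text.split("\n\n"):
        # A single paragraph longer than the limit must be hard-split.
        while len(para) > limit:
            chunks.append(para[:limit])
            para = para[limit:]
        # Start a new chunk if adding this paragraph would overflow.
        if len(current) + len(para) + 2 > limit:
            if current:
                chunks.append(current)
            current = para
        else:
            current = current + "\n\n" + para if current else para
    if current:
        chunks.append(current)
    return chunks
```

At a 4,000-character limit, a 700,000-character judgment would need roughly 175 such chunks, which is why summarising an entire ICL judgment through a chat window remains laborious without tooling on top.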

  2. Legal research

LLMs are able to answer legal research questions that go beyond summarizing a specific, known filing. These prompts tend to yield more accurate information for simpler, widely discussed topics (‘what are the elements of joint criminal enterprise?’) than more arcane ones (‘when is it necessary to disclose witness expenses at the ICTY?’). Dialogue is often key in getting a meaningful answer, as it may be necessary to ask the LLM to ‘develop the part of your answer on [discrete topic]’ to get more information on what you are actually interested in.

Obstacle for further growth? Hallucinations. LLMs currently have a tendency to confidently produce inaccurate citations and even fake cases. Failing to check the results can lead to disastrous consequences. It is critical to think of LLMs as a complement to diligent practice, not a substitute for it. A smaller limitation in the legal research field is that ChatGPT is trained only on internet data up to the end of 2021, so it does not have access to the most recent case law (Bing Chat does not share this issue).

  3. Drafting and editing

When properly prompted with fact patterns and applicable law, LLMs can actually generate draft paragraphs which could serve as a starting point for ICL drafting. They can also help to refine already drafted text to improve clarity and readability. Deliberate specificity as to style may assist in this regard – asking the LLM to write a part of a legal brief in the style of ‘William Schabas’ or ‘an ICC judge’ may yield better results than asking for something to be drafted more generally.

Obstacle for further growth? The extent of revisions. The LLMs prevailing today cannot generate passages with references that could be placed in a draft without further alteration. Proposed changes in word choice can introduce unintended errors where particular precision is required, and LLMs may not always be able to give a clear explanation of why they wrote the sentences they did (which may itself be a signal that the LLM is hallucinating).

Within the constraints of confidentiality and specific institutional procedures, it may be more beneficial for ICL lawyers to use the present drafting capabilities of LLMs as inspiration before drafting an initial document, or as a tool to refine a largely finished one. Regardless of how LLMs are incorporated into written work, the passages generated cannot simply be copied/pasted into broader work without further checking.

  4. Disclosure and evidentiary collection research

LLMs trained on large evidentiary collections can provide a wealth of information about the collection. LLMs could be prompted with information from pending disclosure requests or specific facts relevant to ICL trials. They could also facilitate the capture of evidence in the course of trial, such as summarising witness testimony or isolating key details.

Obstacle for further growth? Confidentiality. The risk of errors in current technology makes it impossible to put sensitive information into publicly available LLMs. Solutions therefore need to be developed for confidential ICL data collections that inspire sufficient confidence to actually be used. Many tech startups develop such tools – Casetext’s new Co-Counsel tool is an example (and can do many things besides e-discovery) – but ICL institutions may need custom-made solutions to achieve widespread adoption. The ICC OTP’s new evidence submission initiative – complete with reliance on AI and machine learning – is a welcome development in this regard.

  5. Legal analytics

LLMs’ ability to answer highly specific questions in the face of massive data could yield some very interesting ‘scouting reports’ on how ICL litigators could react to certain litigation strategies. A particular judge’s rulings on disclosure violations, how a particular Defence attorney cross-examined insider witnesses in past trials, what sentencing ranges a Trial Chamber would be willing to accept for a given offence – all these kinds of data projections are technically possible with the right data sets and prompting.

Obstacle for further growth? Small samples. There simply are not that many ICL trials, which may substantially limit the ability of legal analytics to make meaningful predictions. There is also frequent turnover – ICC judges are elected to non-renewable nine-year terms, for instance, making it highly likely that the judges hearing a given trial are hearing their first trial at the institution. ICL legal analytics may end up being more useful for repetitive elements of trials that can generate larger datasets faster – a presiding judge’s record on questioning objections, for example – than for rarer events for which the data set will be inherently small.


Whatever the limits of their present use, the potential of LLMs for ICL is unmistakable. They are also evolving rapidly, making it difficult to predict when or how they will gain widespread adoption within the field. But ICL would be well served to explore how to maximise these tools and ensure their responsible use.

Photo: UN International Residual Mechanism for Criminal Tribunals, Arusha, Tanzania (Roman Boed, 2019)
