The aim of this post is to start a conversation about unusual evidentiary standards emerging in some judgments at the ICC. Although the underlying impetus is commendable, these standards pose legally unprecedented and epistemologically unsound demands. Remarkably, these novel evidentiary approaches, which depart significantly from national and international practice, have not yet triggered much conversation. As recent cases (such as Gbagbo) have ended in acquittals, the Court-watching community has largely echoed the judicial criticisms of the evidence, and hence blamed inadequate investigations. While investigative improvements are likely part of the solution, any serious effort to repair the ICC has to consider these evidentiary standards. These standards will significantly increase the costs and delays of ICC proceedings. In cases of any complexity, the standards can only result in failed cases. An invigorated sub-discipline – international criminal evidence law – is urgently needed.
In this three-part series of posts, I will focus on the Gbagbo acquittal judgment. Douglas Guilfoyle’s thoughtful ‘tale of two cases’ advances a hypothesis that the difference in outcome between the Gbagbo acquittal and the Ntaganda conviction arises because the latter focused on an easier, smaller case. That may be true, but I want to place alongside that another hypothesis: that the difference between the two outcomes may in part reflect the very different approaches taken by the judges.
I open with a word of sympathy for judges. At an earlier stage of international criminal law, Tribunal judges were often criticized by academics (including me) for adopting approaches that were too pro-conviction and that overlooked rights of the accused. Hence it is entirely understandable that judges and legal officers may have lurched in the other direction, with an eagerness to demonstrate their unparalleled care for the accused.
The problem is when the zeal for impeccable standards swings too far, and produces a method that is so rigid, formalistic, and hypersceptical that it loses sight of substance and feasibility. To briefly summarize some features of this approach (I will explain these features in more detail in the posts to follow):
- A “hypersceptical” approach to potentially incriminating evidence, that looks at each item in isolation, scrutinizing it for any possible reason to disbelieve or downplay it. This includes freely inventing ‘alternative narratives’ for each item, even without any evidentiary support. By contrast, the more standard approach is to assess evidence even-handedly, considering factors that undermine or support the evidence, and then to apply the ‘beyond reasonable doubt’ standard to the totality of the evidence.
- Speculative doubt: Evidence is discredited due to mere conjectured “possibilities”, without testing whether they amount to “reasonable” doubts. Alternative narratives are not subjected to critical assessment of their implausibility. Most importantly, there is no assessment of the cumulative implausibility of all of the different exonerating theories invented for each piece of evidence.
- Credulity to exonerating evidence: The hyperscepticism to incriminating evidence is combined with uncritical credulity toward potential exonerating evidence. This further aggravates the lack of even-handedness, and departs even further from the normal judicial approach.
- Pointillistic corroboration: A novel, narrow conception of corroboration that is legally and epistemologically unsound. It focuses on fine details and fails to apply standard reasoning tools for assessing patterns.
- Rigid formalism over substance: Analytical categories are applied with a rigidity beyond any national system. By contrast, a more standard approach considers context that may make a particular piece of evidence highly reliable.
- Authentication: Authentication standards require near-certainty and entail unsound expectations about documents (for example, that documents should be signed, dated, and stamped).
- Fastidiousness (consistency): Evidence is rejected because of minor inconsistencies that are commonplace in human recollection.
- Fastidiousness (granularity): The approach insists on a level of clinical tidiness and precision that may be unattainable even from an omniscient perspective. Such expectations are especially inappropriate for mass atrocity, which is not necessarily tidily logical.
- Novel barriers for crimes against humanity: Requiring proof of perpetrator motives, perceiving rape as opportunistic and unconnected to surrounding violations, and inquiring whether all members of an organization carry out inhumane acts at every opportunity.
For brevity’s sake, I will label this approach as the “Cartesian” approach, because of its hyperscepticality, its desire for certainty, and its zest for what seems to be logical rigour.
The allegations and evidence in Gbagbo
My post differs from Douglas Guilfoyle’s, because he takes the Trial Chamber majority’s criticisms of the evidence at face value, whereas I am posing a different question: whether the majority’s criticisms are at least partly rooted in unprecedented and problematic understandings about evidence.
In Gbagbo, the Prosecution presented 4610 items of documentary evidence and 96 witnesses. The evidence included, inter alia, eye-witnesses, insider testimony, expert witnesses, documents, videos, photos, audio recordings, physical exhibits, and forensic evidence. The evidence concerned hundreds of instances of killings, beatings, rapes, torture, and burning persons alive, carried out by pro-Gbagbo forces against civilians perceived to be supporters of Gbagbo’s political rival. A lot of the evidence focused on attacks on unarmed demonstrators. This included the use of grenades against protestors.
In the Trial Chamber, the majority (Judges Henderson and Tarfusser) held that there was “no case to answer”, whereas Judge Carbuccia in dissent thought there was sufficient evidence to proceed. In my view, the crucial factor in the divergence between the judges is their approach to evidence. To illustrate the points of divergence, I can highlight some features of Judge Carbuccia’s approach (which in my view is more consistent with national and international practice):
- we are concerned not only with rights of the accused, but also with justice and truth; in assessing evidence we should remember the objective is to get at truth (§6 & 7);
- she does not engage in ‘alternative narratives’ that have no substantiation or basis in the record (§49);
- evidence should not be assessed in a vacuum, but rather should be assessed in light of other evidence and using human experience; evidence can be partly believed and partly disbelieved (§29 & 49);
- as long as there are sufficient guarantees of impartiality, UN and NGO reports can provide reliable information; they can be used to corroborate other evidence or to give more understanding about broader circumstances (§31);
- minor inconsistencies do not automatically render testimony unreliable; circumstances such as time lapse and trauma can be considered (§34);
- for ‘widespread’ crimes against humanity, we cannot expect forensic standards like in small-scale national cases (§38); for example, forensic evidence of 700 bodies is relevant to proof of ‘widespread’, even if not every victim is identified by name (§346).
I do not take a position here on the correct final outcome in the Gbagbo case. I do however think that the majority’s evidentiary expectations were novel and problematic, and my aim is to draw attention to these. The majority judgment is at its best when it argues that the Prosecution case gave inadequate attention to the fact that President Gbagbo faced not only protestors but also armed opposition. Nonetheless, the majority rejected thousands of items of evidence because of excessive standards and speculative doubts that seem to me to be errors of law, which in a national system would easily warrant correction.
I am addressing over 1300 pages of judgment in a few thousand words, and hence I can only outline broad issues, without delving into nuances, and I must skip over countless examples and issues. Except where otherwise indicated, references below are to the Henderson judgment, which gave the detailed reasoning for the majority. My comments here are preliminary, pending more careful study of evidentiary approaches in different systems. My aim is to start a conversation about a possible problem.
Hyperscepticism and alternative narratives
The divergence in approaches to evidence among ICC judges has been framed as one between ‘atomism’ (looking at each piece in isolation) and ‘holism’ (looking at all the evidence together). I think this framing, while largely sound, slightly misses the true controversy. I think the real objection to ‘atomism’ at the ICC is that it is coupled with a “hypersceptical” approach.
“Hyperscepticism” is the opposite error of credulity. It is an excess of unwillingness to be convinced of facts, even when the evidence is adequate to provide the requisite level of confidence. The hypersceptical atomistic approach has started appearing in some ICC decisions, but I have not yet seen it anywhere else. Even judges who apply the hypersceptical atomistic approach at the ICC did not seem to do so before they were at the ICC. It is an anomaly peculiar to ICC culture.
First, the hypersceptical atomistic approach is one-sided: it scours each individual piece of potentially incriminating evidence to find any possible reason to doubt each item. Unlike in the more standard (even-handed) approach, it gives almost no attention to considerations that might bolster the evidence. Furthermore, the judge can freely invent “alternative narratives” for each piece of evidence in isolation, to give each one a non-inculpatory explanation. The result makes for remarkable and distinctive reading: a judgment that postulates exonerative hypotheses for item after item of evidence, for hundreds of pages.
Second, the “alternative narratives” can be purely speculative and hypothetical; they do not need any support in the evidentiary record. The Gbagbo majority judgment is rife with examples, e.g. “such a possibility cannot be ruled out”, “possibility cannot be excluded”, “allow for an alternative reading”, “not exclude the possibility”, “it is possible” (e.g. §411, 1140, 1171). The judgment repeatedly refers to a lack of “certainty” (e.g. §35, 99, 1636) even though certainty is not the standard. Hundreds of items of evidence were set aside because of speculations, without assessing whether those speculations generate a reasonable doubt. In national systems, this alone would be an error of law.
The third, and gravest, problem is the failure to look at the cumulative implausibility of the alternative explanations developed for each individual item. It is easy to have a reasonable doubt about one item in isolation. But when we look at all of the evidence, we see that the improbable stories developed for each item in isolation grow cumulatively more improbable, vanishing well beyond the point of reasonable doubt.
Maybe the police who killed civilians in one incident just ‘panicked’ (Tarfusser at §80). Maybe the killings, beatings and detentions in other incidents were by rogue officers (§1406, §1583). Maybe the killing in another incident was because of a personal grudge (§1417). Maybe in another incident, when police were seen killing civilians, they were actually killing combatants dressed as civilians (§1322). Maybe the civilians killed in another incident died from “stray bullets” or because of unrelated “crimes taking place in the midst of the chaos” (§1557). Maybe the civilians in another incident, last seen under fire from police, and later found dead from bullet wounds, were killed in the interim by hypothetical and unconnected persons unknown (§1746). Maybe, in another incident, contrary to ballistics evidence, the people died from “ricocheting bullets” (§1777). Maybe the Malian man was burned to death for unknown reasons completely unrelated to the wave of burning-to-deaths of foreigners (§1749). Maybe the multiple rapes of protestors were all (somehow?!) unconnected with the dehumanization and violence unleashed against those protestors (§1217). Maybe, contrary to available testimony, the mortar fire did not come from the nearby unit which was the only entity known to have mortars (§1820). Maybe the forces who looked like pro-Gbagbo forces, and who were attacking Ouattara supporters, were actually persons unknown (a new faction unknown to all parties?) (§1154, 1394, 1580, 1613, 1934). Maybe the colonel who said he got orders from Gbagbo was just “embellishing” (§411). Maybe there was a diligent investigation by the state into these crimes, that we just don’t know about (§273). Such speculations continue even for every minor piece of supportive evidence. Maybe the official documents were tampered with by a hypothetical unknown intruder (§35). Maybe the page in the log between 14 December and 16 December was (bizarrely) not 15 December (§1140).
And so on, for 4600 pieces of evidence.
This seems to me a highly unorthodox approach, inventing ‘maybes’ for each item of evidence. If you bring supporting evidence to address speculations, that evidence in turn is doubted with other speculations, so you can never gain a foothold to escape this pit. Even more problematically, however, the approach fails to consider the cumulative improbability of all the hypotheses that must be stacked together to sustain a non-incriminatory account. An extreme number of improbable suppositions rapidly ceases to constitute a ‘reasonable’ doubt.
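To make the arithmetic of cumulative improbability concrete, here is a rough sketch. It assumes, purely for illustration, that the alternative explanations are independent of one another and assigns each an invented probability; neither the independence assumption nor the figures come from the judgment, and real items of evidence are rarely fully independent. The function name is hypothetical.

```python
import math

def joint_probability(probabilities):
    """Chance that every alternative explanation holds at once.

    Assumes the explanations are independent, which is a simplification
    made only for the sake of this illustration.
    """
    return math.prod(probabilities)

# Hypothetical figures: fifteen incidents, each exonerating story
# generously given a 30% chance of being true on its own.
per_item = [0.3] * 15
print(f"All fifteen stories true together: {joint_probability(per_item):.2e}")
```

Even on generous per-item figures, the product collapses quickly, which is the sense in which a stack of individually conceivable ‘maybes’ stops amounting to a reasonable doubt.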
One of the problems with judges indulging too much in speculations and alternative narratives is that the opposing party has no chance to rebut the alternative narratives, as they arise only in the judgment. Another problem is that the judge starts to inhabit the role of defence counsel.
A criminal law system cannot work if the decision-maker is engaged in creative and imaginative resistance to one side’s case, fighting it item-by-item and step-by-step. The standard (and sound) approach in criminal law strives for an impartial assessment, aiming to find the objective truth, by looking even-handedly at considerations that undermine or support both incriminating and exonerating evidence, and drawing reasonable inferences. The standard approach then favours the accused at the end, by applying the standard of ‘beyond reasonable doubt’ for each element of the offence.
Uncritical credulity toward potentially exonerating evidence
In stark contrast to its hyperscepticality toward potentially inculpatory evidence, the Cartesian approach is uncritically credulous toward exonerating evidence. Again, the underlying impulse is commendable: to give every benefit at every step to the accused. However, the reasonable doubt standard does not require such distortions at every step; the normal approach is an even-handed assessment, drawing sensible inferences, with the reasonable doubt standard at the end. The distorted approach sanctifies the interests of the accused but undermines the other aims of criminal law inquiry (truth, accountability, acknowledgement).
I will give one example of this credulity toward exonerating evidence. In regard to ‘insider’ testimony, the Prosecution argued that one should ‘treat with caution’ statements by a witness downplaying his or her own liability, because people have a tendency to self-exonerate. This is a pretty standard caution in criminal law, and also standard in critical source evaluation. Indeed, it is a familiar fact of life for anyone who has had any dealings with human beings: humans tend to present themselves in a favourable light. It is an empirically well-demonstrated tendency – it does not even necessarily reflect dishonesty, but can come from psychological effects such as cognitive dissonance. Accordingly, the Prosecution suggested that this should be borne in mind, especially where a self-serving claim departs from other evidence (which itself is also a staple of holistic assessment of evidence).
Judges Henderson and Tarfusser both employ scathing and indignant language chastising the Prosecution for this suggestion. They admonish the Prosecutor that “if there is anything ‘self-serving’ in this regard, it undoubtedly is the Prosecutor’s attempt to wish away essential parts of the evidence” (§1426). They retort that “the testimony is evidence that cannot be ignored” (§1233). But the proposal was not to ‘wish away’ or ‘ignore’ evidence. The proposal was that in assessing such evidence, one should bear in mind this well-established potential frailty in self-serving statements.
Judge Tarfusser condemned the Prosecution’s suggested caution as ‘hypocritical’ (§71) and ‘surprising’, and asked why, if that is the case, the witnesses were given legal assistance as witnesses (§72). The criticism makes sense if one believes that people are divided into Liars and Truth-Tellers, and that as witnesses we must only bring the Truth-Tellers. However, even under oath, human testimony can be more or less reliable on different subjects, for many reasons. Insiders can provide incredibly valuable information about internal operations, but a fact-finder must at least bear in mind well-known possible frailties when they address their own role. Judges are supposed to assess evidence, not just with abstract analytical rules, but bringing wisdom and experience to bear.
In my view, the majority’s unwillingness to even entertain this commonplace caution – and indeed their vehement rebuke of the idea – is an error of law and a symptom of the problematic excesses of the Cartesian approach.
I have only scratched the surface of the Cartesian approach. In the following posts I will outline some of the other problematic features: pointillistic corroboration, formality over substance, unorthodox authentication standards, intolerance of minor inconsistencies, and insistence on granular detail and fastidious clarity that reality might not provide. These tendencies, although flowing from a commendable impulse to uphold high standards, require urgent attention and analysis.