Insufficient Evidence

Back in March, judges at the International Criminal Court refused to support chief prosecutor Luis Moreno-Ocampo’s call for Sudan’s President Omar al-Bashir to be indicted on charges of genocide. They said there was insufficient evidence. On Monday Ocampo lodged an appeal against the ruling but, as far as one can tell, submitted no new evidence.
The problem of evidence (what to gather, how to gather it and how to interpret it) is going to plague the humanitarian business more and more. The crisis in the making is similar to the one that plagued the medical profession up to the 1970s. Until then, although there were clinical trials and massive community-wide surveys, there was no profession-wide recognition of the need for a rigorous understanding of what constituted good evidence, or of the need for best practice to be based on evidence. Trial and error, experience, wisdom passed on from the old hands and write-ups of practices that gave positive results constituted the bulk of the body of knowledge that guided medical practice.
And that pretty much describes where humanitarian assistance is today.
For medicine, Archie Cochrane’s 1972 book Effectiveness and Efficiency: Random Reflections on Health Services captured the massive change in approach that now dominates the field. The Cochrane Collaboration, named after him, defines the evidence approach thus: “Evidence-based health care is the conscientious use of current best evidence in making decisions about the care of individual patients or the delivery of health services. Current best evidence is up-to-date information from relevant, valid research about the effects of different forms of health care, the potential for harm from exposure to particular agents, the accuracy of diagnostic tests, and the predictive power of prognostic factors.”
In the US, the Preventive Services Task Force (USPSTF) has used this approach to come up with two scales: one for rating the value of evidence and the other for rating the worth of a clinical service. Again, think of the parallels (or maybe the absence of them) with humanitarian service.
On the USPSTF evidence scale, level I evidence is the best, level III the least reliable.
Level I: Evidence obtained from at least one properly designed randomized controlled trial.
Level II-1: Evidence obtained from well-designed controlled trials without randomization.
Level II-2: Evidence obtained from well-designed cohort or case-control analytic studies, preferably from more than one center or research group.
Level II-3: Evidence obtained from multiple time series with or without the intervention. Dramatic results in uncontrolled trials might also be regarded as this type of evidence.
Level III: Opinions of respected authorities, based on clinical experience, descriptive studies, or reports of expert committees.
If we are honest, where do most humanitarian studies sit? Level III, and occasionally the bottom end of level II?
When it comes to clinical practice, or in our parlance relief interventions, it’s all about the balance between knowable benefits and probable risks. The USPSTF provides a five-point scale to grade the “magnitude of net benefit”.
The humanitarian profession, sadly, is a million miles away from this sort of systematic international discipline, and yet we too deal in life and death issues.
Here are the top five sins of insufficient evidence we regularly practice.
The sin of induction. We can’t help it. We view the present by comparing it to the past. We look for patterns in the past and assume that similar patterns in the present mean similar things. The science philosopher Karl Popper did a great job of showing the pernicious nature of this natural way of thinking. We then compound the sin by refusing to let go of it!
So, after the US invasion of Afghanistan, the situation looked post-conflict, so it got modeled thus, and the model was clung to, even though today, if you knew nothing about the past and looked at Afghanistan with impartial eyes, it would patently be a country caught up in war. But we cling to the model and so program for post-conflict, channeling most of our aid in support of the central government, because that is what you do post-conflict to rebuild. Yet, to the skeptic, it looks like the aid business (supposed to be impartial and neutral) has chosen to back one side in the conflict.
It’s the same in Darfur: aggressive Arab war criminals pitted against victimized African farmers. That’s the dominant model, yet all our research shows it is a wild distortion and only a small part of what is going on.
In Nepal, the Kathmandu-based development community programmed as though there were no conflict in the country.

Conclusion one: The initial models upon which we base our programming assumptions are often wrong, and even though they are wrong they are desperately hard to shake off.
The sin of “all other things being equal”. Economists use this really irritating turn of phrase when they want to prove the predictive power of their models. The problem, of course, is that reality does not behave. All other things are not equal. “If funding is not provided / aid agencies do not have access / people are not allowed to move / medical supplies are not let through, then XXX,XXX people are in danger of dying.” These sorts of predictions are only true if nothing else changes. In reality the predictions rarely come true, because people take steps to help themselves, others step in, and local communities or agencies provide help. In other words, the international aid machine is not the fount of all salvation.
Conclusion two: Crises are dynamic. Predictions based on the centrality of aid agencies are of little value.

The sin of cherry picking:
This is a well-intentioned but distorting sin. We tend to quote the data that supports our case. Human rights reports cite incidents of human rights violations, a subset of the sample as it were. What about the incidence of human rights being upheld, or not violated? We selectively quote the case studies and reports that support the case we want to build, thus confusing advocacy with evidence. We over-report our successes and under-report our failures.
Conclusion three: Advocacy becomes self-deluding. We start to confuse our advocates’ world with reality, distorting the urgency, scale, depth and horror of a crisis.
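The distortion is easy to demonstrate with a toy simulation (all numbers here are invented, purely for illustration): if every success is written up but most failures are quietly dropped, the reported success rate drifts far above the true one.

```python
import random

random.seed(0)

# Hypothetical record: 100 projects, each succeeding with probability 0.4.
outcomes = [random.random() < 0.4 for _ in range(100)]
true_rate = sum(outcomes) / len(outcomes)

# Cherry-picked reporting: every success is written up,
# but only about 1 in 5 failures makes it into print.
reported = [o for o in outcomes if o or random.random() < 0.2]
reported_rate = sum(reported) / len(reported)

print(f"true success rate:     {true_rate:.0%}")
print(f"reported success rate: {reported_rate:.0%}")
# The reported rate exceeds the true rate purely through selection,
# without a single fabricated result.
```

Nothing in the simulated record is false; the bias comes entirely from which cases are quoted.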
The sin of classification: We like cutoff points to help simplify our decision making, but the sin is believing that the cutoff point itself matters. A mortality rate of more than 1/10,000/day means you have a crisis on your hands, but why 1/10,000? Why not 2, or 0.5? In reality the rate is an artificial cutoff. In nutrition, different organizations propose different cutoffs for severe and acute malnutrition. And the interpretation of those malnutrition rates is highly dependent on the methodology used to measure malnutrition.
Conclusion four: Our fixing and interpretation of cutoff points, severity scales and standards belies our lack of true understanding of how the complex processes at work in a crisis actually affect survival chances.
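A small sketch of the classification problem, using entirely invented survey figures: the same set of mortality rates paints very different pictures of the emergency depending on where the arbitrary line is drawn.

```python
# Hypothetical crude mortality rates from five sites, in deaths/10,000/day.
rates = [0.4, 0.9, 1.1, 1.8, 2.3]

# The same data, classified under three candidate cutoffs.
for cutoff in (0.5, 1.0, 2.0):
    in_crisis = [r for r in rates if r > cutoff]
    print(f"cutoff {cutoff}/10,000/day -> "
          f"{len(in_crisis)} of {len(rates)} sites 'in crisis'")
```

With these made-up figures, the count of sites “in crisis” swings from four to one as the threshold moves, while the underlying mortality has not changed at all.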

The sin of causality.
This comes in two versions. In the first we interpret statistical correlations as proven cause and effect. In the second we assume that cause and effect follow a simple relationship.
Economists are raving about the beneficial effects cell phone ownership has on economic growth in the south. As cell phone penetration into the market goes up, so too does GDP. So what? Which causes which, or are they both caused by deeper underlying processes? There is a wonderful negative correlation, up until the early 2000s, between the number of pirates operating on the high seas and the average global temperature, but this does not mean that countering pirates increases global warming.
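The pirates example is a confounder at work. Here is a minimal sketch, with made-up series standing in for phone penetration and GDP: two variables that never influence each other, but both ride the same underlying trend, can still be almost perfectly correlated.

```python
import random

random.seed(1)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Two hypothetical series generated independently of each other:
# each is just the shared trend (time, t) plus its own noise.
years = range(30)
phones = [2 * t + random.gauss(0, 3) for t in years]   # "phone penetration"
gdp = [5 * t + random.gauss(0, 10) for t in years]     # "GDP"

print(f"correlation: {pearson(phones, gdp):.2f}")
# The correlation is very high, yet neither series was generated from
# the other: both are driven by the shared trend, t.
```

The correlation is real; the causal story it seems to tell is not. Only the common driver links the two.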
Secondly, we like simple models. The more malnourished a person is, the more at risk of death they are. Sounds plausible, but as Helen Young and Suzanne Jaspars have shown, it’s more complex than that. There is actually far less correlation than people expect, because malnutrition is the end point of so many different livelihood and public health scenarios.
Conclusion five: Beware correlations, always seek to prove causality.
In sum, our hypotheses and conceptual models of what happens in a crisis are still far too poorly developed to allow us to be truly evidence-based. As a result we do not yet know enough about what to measure, diagnose and prescribe in a crisis. We do not know how best to measure, and we do not know how to fully interpret those measurements.
In short, humanitarian assistance needs its own evidence-based revolution, just as medicine did. The good news is that, as Archie Cochrane and his successors showed, it can be done, and once adopted it makes a significant difference to outcomes.
