After doing survey research in Iraq for nearly two years, I was surprised to read that a study by a group from Johns Hopkins University claims that 655,000 Iraqis have died as a result of the war. Don't get me wrong: there have been far too many deaths in Iraq by anyone's measure, and some of them have been friends of mine. But the Johns Hopkins tally is wildly at odds with any numbers I have seen in that country. Survey results frequently have a margin of error of plus or minus 3% or 5%--not 1200%.
The group--associated with the Johns Hopkins Bloomberg School of Public Health--employed cluster sampling for in-person interviews, which is the methodology that I and most researchers use in developing countries. Here in the U.S., opinion surveys are often conducted by telephone, with individuals selected at random. But in a country with low telephone penetration, door-to-door interviews are required: neighborhoods are selected at random, and then individuals are selected at random in "clusters" within each neighborhood. Without cluster sampling, the expense and time associated with travel would make in-person interviewing virtually impossible.
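For readers who want to see the mechanics, here is a minimal sketch of that two-stage selection in Python. Every number in it--the size of the sampling frame, the households per neighborhood, the interviews per cluster--is hypothetical, chosen only to echo the scale of a 47-cluster, 1,849-interview survey:

```python
import random

random.seed(42)  # reproducible illustration

# Hypothetical sampling frame: 500 neighborhoods of 200 households each.
neighborhoods = {
    f"neighborhood_{i}": [f"household_{i}_{j}" for j in range(200)]
    for i in range(500)
}

N_CLUSTERS = 47               # cluster points, as in the Johns Hopkins study
INTERVIEWS_PER_CLUSTER = 40   # roughly 1,849 interviews / 47 clusters

# Stage 1: select the cluster points (neighborhoods) at random.
cluster_points = random.sample(list(neighborhoods), N_CLUSTERS)

# Stage 2: within each cluster point, select households at random
# for door-to-door interviews.
sample = []
for nb in cluster_points:
    sample.extend(random.sample(neighborhoods[nb], INTERVIEWS_PER_CLUSTER))

print(f"{len(cluster_points)} cluster points, {len(sample)} interviews")
# -> 47 cluster points, 1880 interviews
```

The point of the design is purely logistical: interviewers travel to 47 places instead of 1,849.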
However, the key to the validity of cluster sampling is to use enough cluster points. In their 2006 report, "Mortality after the 2003 invasion of Iraq: a cross-sectional cluster sample survey," the Johns Hopkins team says it used 47 cluster points for its sample of 1,849 interviews. This is astonishing: I wouldn't survey a junior high school, much less an entire country, using only 47 cluster points.
Neither would anyone else. For its 2004 survey of Iraq, the United Nations Development Program (UNDP) used 2,200 cluster points of 10 interviews each for a total sample of 21,688. True, interviews are expensive, and not everyone has the U.N.'s bank account. But even surveys with similarly sized samples use far more cluster points: a 2005 survey conducted by ABC News, Time magazine, the BBC, NHK and Der Spiegel used 135 cluster points for a sample of 1,711--nearly three times the Johns Hopkins team's cluster points for 93% of its sample size.
What happens when you don't use enough cluster points in a survey? You get results that swing wildly when compared to a known quantity, or to a survey with more cluster points. There was a perfect example of this two years ago. The UNDP survey, conducted in April and May 2004, estimated between 18,000 and 29,000 Iraqi civilian deaths due to the war. Four months later, the Johns Hopkins team fielded its first study, which used just 33 cluster points and estimated between 69,000 and 155,000 civilian deaths--four to five times as high as the UNDP figures, even though the UNDP survey had used 66 times as many cluster points.
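The statistical intuition is easy to demonstrate. Below is a toy simulation--every number in it is hypothetical--of a country where violent deaths are concentrated in a small fraction of neighborhoods. With only a few dozen cluster points, the estimate depends heavily on whether the hard-hit neighborhoods happen to fall in the sample, so repeated surveys scatter widely; with thousands of cluster points, the estimates settle down:

```python
import random
import statistics

random.seed(0)  # reproducible illustration

# Hypothetical country: in 5% of neighborhoods the death rate is very
# high (20%); in the other 95% it is low (1%).
neighborhood_rates = [
    0.20 if random.random() < 0.05 else 0.01 for _ in range(1000)
]

def survey_estimate(n_clusters, interviews_per_cluster=10):
    """One simulated survey: pick cluster points, interview within each."""
    clusters = random.choices(neighborhood_rates, k=n_clusters)
    deaths = sum(
        1
        for rate in clusters
        for _ in range(interviews_per_cluster)
        if random.random() < rate
    )
    return deaths / (n_clusters * interviews_per_cluster)

# Repeat each survey design 200 times and see how much the estimates swing.
for n_clusters in (33, 47, 2200):
    estimates = [survey_estimate(n_clusters) for _ in range(200)]
    print(
        f"{n_clusters:>5} clusters: mean={statistics.mean(estimates):.4f}, "
        f"spread (sd)={statistics.stdev(estimates):.4f}"
    )
```

Run it and the spread of the 33-cluster estimates is several times larger than that of the 2,200-cluster estimates, even though every simulated survey follows the same honest procedure.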
The 2004 survey by the Johns Hopkins group was itself methodologically suspect--and the one they just published even more so.
Curious about the kind of people who would have the chutzpah to claim to a national audience that this kind of research was methodologically sound, I contacted Johns Hopkins University and was referred to Les Roberts, one of the primary authors of the study. Dr. Roberts defended his 47 cluster points, saying that this was standard. I'm not sure whose standards these are.
Appendix A of the Johns Hopkins survey, for example, cites several other studies of mortality in war zones to validate the group's use of cluster sampling. One, by the International Rescue Committee in the Democratic Republic of Congo, used 750 cluster points. A 1992 survey of Iraq by Harvard's School of Public Health used 271. A third cited study, in Kosovo, used 50 cluster points--but for a population of just 1.6 million, compared to Iraq's 27 million.
When I pointed out these numbers to Dr. Roberts, he said that the appendices were written by a student and should be ignored. Which led me to wonder what other sections of the survey should be ignored.
With so few cluster points, it is highly unlikely that the Johns Hopkins survey is representative of the population in Iraq. However, there is a definitive method of establishing whether it is: recording the gender, age, education and other demographic characteristics of respondents allows a researcher to compare his survey results to a known demographic instrument, such as a census.
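The check itself is routine. Here is a sketch of it, with entirely hypothetical age-group counts and census proportions standing in for real Iraqi figures:

```python
from scipy.stats import chisquare

SAMPLE_SIZE = 1849

# Hypothetical age-group counts observed among survey respondents
# (18-29, 30-44, 45-59, 60+). Illustrative numbers, not real data.
observed = [410, 620, 500, 319]

# Hypothetical census proportions for the same age groups.
census_proportions = [0.24, 0.32, 0.27, 0.17]
expected = [p * SAMPLE_SIZE for p in census_proportions]

# Chi-square goodness-of-fit test: a large statistic (small p-value)
# means the sample's demographics diverge from the census.
stat, p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {stat:.2f}, p = {p_value:.3f}")
```

A sample whose demographics diverge sharply from the census cannot claim to be representative without an explanation.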
Dr. Roberts said that his team's surveyors did not ask demographic questions. I was so surprised to hear this that I emailed him later in the day to ask a second time if his team asked demographic questions and compared the results to the 1997 Iraqi census. Dr. Roberts replied that he had not even looked at the Iraqi census.
And so, while the gender and the age of the deceased were recorded in the 2006 Johns Hopkins study, nobody, according to Dr. Roberts, recorded demographic information for the living survey respondents. This would be the first survey I have encountered in my 15 years of survey research that did not ask demographic questions of its respondents. But don't take my word for it--try using Google to find a survey that does not ask demographic questions.
Without demographic information to assure a representative sample, there is no way anyone can prove--or disprove--that the Johns Hopkins estimate of Iraqi civilian deaths is accurate.
Public-policy decisions based on this survey will impact millions of Iraqis and hundreds of thousands of Americans. It's important that voters and policy makers have accurate information. When the question matters this much, it is worth taking the time to get the answer right.
Source: WSJ