… or, A Story of How a Coding Screw-Up Made Bangladesh One of the Least Tolerant Countries in the World. (Spoiler: It isn’t!)
What We Thought We Knew
Yesterday, the Washington Post put out a story, A fascinating map of the world’s most and least racially tolerant countries. In that map, India and Bangladesh stuck out like a baboon’s butt as bastions of intolerance in the world. It’s reproduced below; red implies that more people said they “would not want neighbors of a different race”.
In fact, the percentages are more than a little damning:
“India, Jordan, Bangladesh and Hong Kong by far the least tolerant. In only three of 81 surveyed countries, more than 40 percent of respondents said they would not want a neighbor of a different race. This included 43.5 percent of Indians, 51.4 percent of Jordanians and an astonishingly high 71.8 percent of Hong Kongers and 71.7 percent of Bangladeshis.”
Three thoughts occurred to me in this order:
- Wow that’s an odd basket of countries to be lumped together as the most intolerant!
- Ouch! Yeah I’m Bangladeshi.. and while I’ll be the first to admit we have our own favorite national stereotypes and periods of ethnographically inspired excitement, least tolerant? Really?
- I wonder if someone fat fingered on this big time.
Thanks to the fact that both Max Fisher of WaPo and World Values Survey folks freely shared their sources, we can take a dive into the data that generated the map to explore thought #3 to our heart’s content.
The short answer is, yes, someone did fat finger this big time. “Yes” and “No” got swapped in the second round of the survey, which means that 28.3% of Bangladeshis said they wouldn’t want neighbors of a different race – not 71.7%.
26K Facebook likers and 2.5K Tweeters, take note.
Now, the long version for the data wonks amongst you. By the way, this piece is restricted to Bangladesh – time, and the ability to read the primary questionnaire, being the main constraints.
What the WVS Data Really Says
(Spoiler: Data says it’s confused…)
There were 5 waves of data collection:
Bangladesh has data for the third and fourth wave.
First, let’s reproduce the 71.7% number. We can use the interactive query tool WVS has set up on their website at: http://www.wvsevsdb.com/wvs/WVSIntegratedEVSWVSvariables.jsp?Idioma=I
“Mentioned” basically means “Yes” (we’ll explore this in detail in a bit).
What about the 1996 survey? Here’s the same page with 1996 data (table shown only):
If the oddity isn’t jumping out at you yet, let’s use something a little more visually friendly to compare the 1996 and 2002 numbers. WVS also makes available an Online Data Analysis toolkit at http://www.wvsevsdb.com/wvs/WVSAnalizeQuestion.jsp. With a little tinkering, we can get to this screen:
“Mentioned” (the measure for less tolerance) went from 17.3% to 71.7%, while “Not Mentioned” went from 82.7% to 28.3%. So either:
a) something WTHBBQPWN&%$#* happened in those 6 years that made 54.4% of Bangladeshis do a 180 on their tolerance levels, OR
b) the coding got messed up and “Mentioned” is either 82.7% and 71.7% in 1996 and 2002 respectively, or 17.3% and 28.3% respectively.
For a survey size of N = 1,500ish, b) is always your safer bet when no obvious change agent is involved, endogenous or exogenous.
(If you’re Bangladeshi, you’re also probably laughing your behind off at a) since the 54.4% number involves tens of millions of people in a generally syncretistic society where appreciation for that heritage has arguably only increased with the younger generations, but that’s “anecdotal” for the purposes of this piece, so we’ll leave that thought there.)
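To put a rough number on why plain sampling noise is not a third candidate alongside a) and b), here is a minimal Python sketch of the textbook 95% margin of error for a proportion. It assumes a simple random sample of about 1,500 respondents per wave; real survey designs are usually more complex, so the true margin is likely somewhat wider. The function name `margin_of_error` is just an illustrative helper, not anything from the WVS tooling.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a share p, assuming a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)


n = 1500  # approximate sample size per wave, as stated in the text
for share in (0.173, 0.717):
    moe = margin_of_error(share, n)
    print(f"{share:.1%} +/- {moe:.1%}")
```

Under these assumptions each estimate is good to within roughly two percentage points, which is tiny next to a 54.4-point swing. So the jump is either a real change or a coding problem, which is exactly the a) versus b) choice above.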
If your data analysis foo is up to the task, I’d encourage you to check out the raw data itself to confirm that this is also the case there. The dataset is generously available at: http://www.wvsevsdb.com/wvs/WVSData.jsp
I used the Stata file for the “WVS FIVE WAVE AGGREGATED FILE 1981-2005” dataset; you can also choose SPSS or SAS formats if they suit your toolkit better. For Stata, the relevant command is:
tab S002 A124_02 if S003 == 50
Giving the result:
                    | Neighbours: People of
                    |   a different race
               Wave | Not menti  Mentioned |     Total
--------------------+----------------------+----------
          1994-1999 |     1,261        264 |     1,525
          1999-2004 |       425      1,075 |     1,500
--------------------+----------------------+----------
              Total |     1,686      1,339 |     3,025
(S002 is the Wave, A124_02 is the question under study, and S003 contains the country, with Bangladesh being code 50.)
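For anyone without Stata, the percentages from the WVS web tool can be cross-checked against these raw counts by hand. The snippet below is a minimal Python sketch that does just that; the `counts` dictionary and the `row_percentages` helper are illustrative names of my own, and the numbers are hand-copied from the tabulation rather than read from the dataset.

```python
# Counts hand-copied from the Stata tabulation (Bangladesh, S003 == 50).
# The dictionary layout and function name are illustrative, not part of WVS.
counts = {
    "1994-1999": {"Not mentioned": 1261, "Mentioned": 264},
    "1999-2004": {"Not mentioned": 425, "Mentioned": 1075},
}


def row_percentages(row):
    """Convert one wave's raw counts into percentages of that wave's total."""
    total = sum(row.values())
    return {label: round(100 * n / total, 1) for label, n in row.items()}


for wave, row in counts.items():
    print(wave, row_percentages(row))
```

The first wave comes out at 17.3% “Mentioned” and the second at 71.7%, so the raw file and the online tool agree with each other, oddity included.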
What Should The Data Say?
(Spoiler: Bangladeshis are a tolerant bunch – it’s ok to come visit.)
Ok, now that we are reasonably certain the data is confused, which way will it point once we de-confuse it? For this, we need to turn to the actual questionnaires. These too are available at http://www.wvsevsdb.com/wvs/WVSDocumentation.jsp?Idioma=I.
First, the 1996 survey. The relevant section is reproduced below (Bangladesh_WVS_1996_1.pdf, pg 10):
The first column, উল্লেখ করেছেন, means “Mentioned” – to be selected if the respondent notes a particular group of individuals (V51 – V60) are unwelcome neighbors.
The second column, উল্লেখ করেন নি, means “Not Mentioned” – this is for the more chillaxed bunch.
Now, let’s look at the 2002 survey. The relevant section is reproduced below (Bangladesh_WVS_2002_1.pdf, pg 19):
The first column, প্রতিবেশী হিসেবে পছন্দ করবো, means “Would like [X] as a neighbor” – to be selected if the respondent notes a particular group of individuals (V68 – V77) are welcome neighbors.
The second column, প্রতিবেশী হিসেবে পছন্দ করবো না, means “Would not like [X] as a neighbor” – this is for the less chillaxed bunch.
‘1’ and ‘2’ stand for totally the opposite things in the two surveys.
Unless we are willing to allow that the data input folks consciously converted a ‘2’ in 2002 to a ‘1’ in 1996 to connect প্রতিবেশী হিসেবে পছন্দ করবো না to উল্লেখ করেছেন, I think it’s reasonable to assume that ‘1’ was also coded for “Mentioned”/উল্লেখ করেছেন in the 2002 dataset, leading to a flip in the results for that question for that wave. Spot checks also suggest that this is what would be consistent with surveys from other countries.
With everything righted the right way, here then is what the final numbers look like:
Yes, 28.3%, not 71.7%.
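The correction itself is nothing more than exchanging the two labels for the 2002 wave. As a minimal Python sketch, assuming the swap argued for above is indeed what happened (the variable names are illustrative, and the counts are hand-copied from the Stata tabulation):

```python
# 2002 counts as they appear in the dataset (wave 1999-2004, Bangladesh),
# hand-copied from the Stata tabulation earlier in the post.
coded_2002 = {"Not mentioned": 425, "Mentioned": 1075}

# Assumption argued in the text: the two labels were swapped for this wave,
# so the correction is simply to exchange the counts between the labels.
corrected_2002 = {
    "Not mentioned": coded_2002["Mentioned"],
    "Mentioned": coded_2002["Not mentioned"],
}

total = sum(corrected_2002.values())
mentioned_pct = round(100 * corrected_2002["Mentioned"] / total, 1)
print(f"Corrected 'Mentioned' share for 2002: {mentioned_pct}%")
```

That is 425 out of 1,500 respondents, which is where the 28.3% comes from.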
- I only looked at the question that was used for the tolerance/intolerance WaPo piece. This inconsistency shouldn’t be extrapolated to any part of the rest of the Bangladesh survey unless one has double checked for that.
- I ran this for the Bangladesh dataset only. No idea if any of this applies to any other country dataset – same caution as above applies.
Who knew dissertation proposals took so long to pump out … Will blame the lull in blogging on that, though there’s been a lot going on in the background that I’d love to share “soon” on the blog.
In the meantime, let me just put this interview out there. Jonathan Morduch noted in his Household Savings in Developing Economies: An Annotated Reading List in 2008 that there are three main types of contributors to the literature on microfinance: 1) academic economists, 2) practitioners, and 3) anthropologists and sociologists. Lamia Karim is an anthropologist who has a new book out called Microfinance and Its Discontents: Women in Debt in Bangladesh, and I am waiting eagerly to get my hands on it via ILLiad.
Btw, spoiler alert: she’s part of the “microcredit is bad for poor people, mmkay?” crowd. Don’t let that detract from her message though – this stuff is worth keeping in mind so that we don’t have to deal with another AP-style disaster again.