… or, A Story of How a Coding Screw-Up Made Bangladesh One of the Least Tolerant Countries in the World. (Spoiler: It isn’t!)
What We Thought We Knew
Yesterday, the Washington Post put out a story, “A fascinating map of the world’s most and least racially tolerant countries”. In that map, India and Bangladesh stuck out like a baboon’s butt as bastions of intolerance in the world. It’s reproduced below; red implies that more people said they “would not want neighbors of a different race”.
In fact, the percentages are more than a little damning:
“India, Jordan, Bangladesh and Hong Kong by far the least tolerant. In only three of 81 surveyed countries, more than 40 percent of respondents said they would not want a neighbor of a different race. This included 43.5 percent of Indians, 51.4 percent of Jordanians and an astonishingly high 71.8 percent of Hong Kongers and 71.7 percent of Bangladeshis.”
Three thoughts occurred to me in this order:
- Wow that’s an odd basket of countries to be lumped together as the most intolerant!
- Ouch! Yeah I’m Bangladeshi.. and while I’ll be the first to admit we have our own favorite national stereotypes and periods of ethnographically inspired excitement, least tolerant? Really?
- I wonder if someone fat fingered on this big time.
Thanks to the fact that both Max Fisher of WaPo and World Values Survey folks freely shared their sources, we can take a dive into the data that generated the map to explore thought #3 to our heart’s content.
The short answer is, yes, someone did fat finger this big time. “Yes” and “No” got swapped in the second round of the survey, which means that 28.3% of Bangladeshis said they wouldn’t want neighbors of a different race – not 71.7%.
26K Facebook likers and 2.5K Tweeters, take note.
Now, the long version for the data wonks amongst you. By the way, this piece is restricted to Bangladesh; time and the ability to read the primary questionnaires were the main constraints.
What the WVS Data Really Says
(Spoiler: Data says it’s confused…)
There were 5 waves of data collection:
Bangladesh has data for the third and fourth wave.
First, let’s reproduce the 71.7% number. We can use the interactive query tool WVS has set up on their website at: http://www.wvsevsdb.com/wvs/WVSIntegratedEVSWVSvariables.jsp?Idioma=I
To get this:
“Mentioned” basically means “Yes” (we’ll explore this in detail in a bit).
What about the 1996 survey? Here’s the same page with 1996 data (table shown only):
If the oddity isn’t jumping out at you yet, let’s use something a little more visually friendly to compare the 1996 and 2002 numbers. WVS also makes available an Online Data Analysis toolkit at http://www.wvsevsdb.com/wvs/WVSAnalizeQuestion.jsp. With a little tinkering, we can get to this screen:
“Mentioned” (the measure for less tolerance) went from 17.3% to 71.7%, while “Not Mentioned” went from 82.7% to 28.3%. So either:
a) something WTHBBQPWN&%$#* happened in those 6 years that made 54.4% of Bangladeshis do a 180 on their tolerance levels, OR
b) the coding got messed up and “Mentioned” is either 82.7% and 71.7% in 1996 and 2002 respectively, or 17.3% and 28.3% respectively.
For a survey size of N = 1,500ish, b) is always your safer bet when no obvious change agent is involved, endogenous or exogenous.
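Sampling noise, at least, can be ruled out as a third explanation. Here is a minimal sketch of a pooled two-proportion z-test on the two waves. The counts (264 of 1,525 and 1,075 of 1,500) are the “Mentioned” counts from the raw data tabulated later in this piece. The function name is my own, and the test treats each wave as a simple random sample, which the WVS design only approximates.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic for H0: p1 == p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p2 - p1) / se

# "Mentioned" counts for Bangladesh as published: 1996 wave, then 2002 wave.
z = two_proportion_z(264, 1525, 1075, 1500)
print(f"1996: {264 / 1525:.1%}  2002: {1075 / 1500:.1%}  z = {z:.1f}")
```

The statistic comes out at roughly 30, far beyond anything chance would produce at this sample size. So the jump is either a real swing or a coding problem, which is exactly the a) versus b) choice above.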
(If you’re Bangladeshi, you’re also probably laughing your behind off at a) since the 54.4% number involves tens of millions of people in a generally syncretistic society where appreciation for that heritage has arguably only increased with the younger generations, but that’s “anecdotal” for the purposes of this piece, so we’ll leave that thought there.)
If your data analysis fu is up to the task, I’d encourage you to check out the raw data itself to confirm that this is also the case there. The dataset is generously available at: http://www.wvsevsdb.com/wvs/WVSData.jsp
I used the STATA file for the “WVS FIVE WAVE AGGREGATED FILE 1981-2005” dataset; you can also choose SPSS or SAS formats if they suit your toolkit better. For Stata, the relevant command is:
tab S002 A124_02 if S003 == 50
Giving the result:
               | Neighbours: People of
               |   a different race
          Wave | Not menti  Mentioned |     Total
    -----------+----------------------+----------
     1994-1999 |     1,261        264 |     1,525
     1999-2004 |       425      1,075 |     1,500
    -----------+----------------------+----------
         Total |     1,686      1,339 |     3,025
(S002 is the Wave, A124_02 is the question under study, and S003 contains the country, with Bangladesh being code 50.)
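For readers without Stata, the row percentages behind that table are easy to check by hand. Here is a small Python sketch with the counts typed in from the output above. The dictionary layout and names are mine, not anything WVS ships.

```python
# Counts copied from the Stata tabulation: (not mentioned, mentioned) per wave.
counts = {
    "1994-1999": (1261, 264),
    "1999-2004": (425, 1075),
}

def row_percentages(not_mentioned, mentioned):
    """Return (% not mentioned, % mentioned) for one wave, rounded to 0.1."""
    total = not_mentioned + mentioned
    return (round(100 * not_mentioned / total, 1),
            round(100 * mentioned / total, 1))

shares = {wave: row_percentages(*row) for wave, row in counts.items()}
for wave, (not_pct, yes_pct) in shares.items():
    print(f"{wave}: not mentioned {not_pct}%, mentioned {yes_pct}%")
```

The shares match the 17.3% and 71.7% from the online tool, so the raw file and the query tool agree with each other. Whatever went wrong happened before the data reached either of them.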
What Should The Data Say?
(Spoiler: Bangladeshis are a tolerant bunch – it’s ok to come visit.)
Ok, now that we are reasonably certain the data is confused, which way will it point once we de-confuse it? For this, we need to turn to the actual questionnaires. These too are available at http://www.wvsevsdb.com/wvs/WVSDocumentation.jsp?Idioma=I.
First, the 1996 survey. The relevant section is reproduced below (Bangladesh_WVS_1996_1.pdf, pg 10):
The top line is the question, basically asking who the respondent “does not want or would not like to have as a neighbor”.
The first column, উল্লেখ করেছেন, means “Mentioned” – to be selected if the respondent notes a particular group of individuals (V51 – V60) are unwelcome neighbors.
The second column, উল্লেখ করেন নি, means “Not Mentioned” – this is for the more chillaxed bunch.
Now, let’s look at the 2002 survey. The relevant section is reproduced below (Bangladesh_WVS_2002_1.pdf, pg 19):
The top line has the same question as above, verbatim. The first and second column headers, however, are not the same.
The first column, প্রতিবেশী হিসেবে পছন্দ করবো, means “Would like [X] as a neighbor” – to be selected if the respondent notes a particular group of individuals (V68 – V77) are welcome neighbors.
The second column, প্রতিবেশী হিসেবে পছন্দ করবো না, means “Would not like [X] as a neighbor” – this is for the less chillaxed bunch.
‘1’ and ‘2’ stand for totally the opposite things in the two surveys.
Unless we are willing to allow that the data input folks consciously converted each ‘2’ on the 2002 forms into a ‘1’ (so that প্রতিবেশী হিসেবে পছন্দ করবো না lined up with the 1996 উল্লেখ করেছেন), I think it’s reasonable to assume that ‘1’ was again coded as “Mentioned”/উল্লেখ করেছেন in the 2002 dataset, leading to a flip in the results for that question for that wave. Spot checks also suggest that this reading is the one consistent with surveys from other countries.
With everything righted the right way, here then is what the final numbers look like:
Yes, 28.3%, not 71.7%.
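The fix amounts to reading the 2002 codes with the 2002 questionnaire’s own column headers rather than the 1996 ones. Here is a hedged sketch of that recode, working from the published counts. The label dictionaries paraphrase the two questionnaires. The assumption that the dataset stored the raw ‘1’/‘2’ ticks unchanged is this piece’s inference, not something WVS has confirmed.

```python
# What a tick in column 1 or 2 meant on each questionnaire.
# True = respondent does NOT want this group as a neighbour.
UNWELCOME_BY_WAVE = {
    "1996": {1: True, 2: False},   # 1 = mentioned, 2 = not mentioned
    "2002": {1: False, 2: True},   # 1 = would like, 2 = would not like
}

# Raw code counts per wave, ASSUMING the dataset labelled code 1 "Mentioned"
# in both waves (so 264 ticks of '1' in 1996 and 1,075 in 2002).
raw_counts = {
    "1996": {1: 264, 2: 1261},
    "2002": {1: 1075, 2: 425},
}

def share_unwelcome(wave):
    """Percent of respondents rejecting a neighbour of a different race."""
    meaning = UNWELCOME_BY_WAVE[wave]
    codes = raw_counts[wave]
    unwelcome = sum(n for code, n in codes.items() if meaning[code])
    return round(100 * unwelcome / sum(codes.values()), 1)

corrected = {wave: share_unwelcome(wave) for wave in raw_counts}
print(corrected)
```

Under that assumption the 1996 figure stays at 17.3% and the 2002 figure becomes 28.3%. That is a change of 11 points between waves rather than 54.4.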
- I only looked at the question that was used for the tolerance/intolerance WaPo piece. This inconsistency shouldn’t be extrapolated to any part of the rest of the Bangladesh survey unless one has double checked for that.
- I ran this for the Bangladesh dataset only. No idea if any of this applies to any other country dataset – same caution as above applies.