CEME Inclusive Commerce Blog Hosted by the Center for Emerging Market Enterprises (CEME), The Fletcher School

16 May 2013

Surveys Gone Bad – When “Yes” means “No”

… or, A Story of How a Coding Screw-Up Made Bangladesh One of the Least Tolerant Countries in the World. (Spoiler: It isn’t!)

What We Thought We Knew

Yesterday, the Washington Post put out a story, “A fascinating map of the world’s most and least racially tolerant countries”. In that map, India and Bangladesh stuck out like a baboon’s butt as bastions of intolerance in the world. It’s reproduced below; red implies that more people said they “would not want neighbors of a different race”.

[Image: wpost_tolerance_map_20130516]

In fact, the percentages are more than a little damning:

“India, Jordan, Bangladesh and Hong Kong by far the least tolerant. In only three of 81 surveyed countries, more than 40 percent of respondents said they would not want a neighbor of a different race. This included 43.5 percent of Indians, 51.4 percent of Jordanians and an astonishingly high 71.8 percent of Hong Kongers and 71.7 percent of Bangladeshis.”

Three thoughts occurred to me in this order:

  • Wow, that’s an odd basket of countries to be lumped together as the most intolerant!
  • Ouch! Yeah, I’m Bangladeshi, and while I’ll be the first to admit we have our own favorite national stereotypes and periods of ethnographically inspired excitement, least tolerant? Really?
  • I wonder if someone fat-fingered this big time.

Thanks to the fact that both Max Fisher of WaPo and the World Values Survey folks freely shared their sources, we can take a dive into the data that generated the map to explore thought #3 to our hearts’ content.

The short answer is, yes, someone did fat finger this big time. “Yes” and “No” got swapped in the second round of the survey, which means that 28.3% of Bangladeshis said they wouldn’t want neighbors of a different race – not 71.7%.

26K Facebook likers and 2.5K Tweeters, take note.

Now, the long version for the data wonks amongst you. By the way, this piece is restricted to Bangladesh; time and the ability to read the primary questionnaire were the main constraints.

What the WVS Data Really Says

(Spoiler: Data says it’s confused…)

There were 5 waves of data collection:

  • 1981-1984
  • 1989-1993
  • 1994-1999
  • 1999-2004
  • 2005-2007

Bangladesh has data for the third and fourth waves.

First, let’s reproduce the 71.7% number. We can use the interactive query tool WVS has set up on their website at http://www.wvsevsdb.com/wvs/WVSIntegratedEVSWVSvariables.jsp?Idioma=I to get this:

[Image: wvs_sample1]

“Mentioned” basically means “Yes” (we’ll explore this in detail in a bit).

What about the 1996 survey? Here’s the same page with 1996 data (table shown only):

[Image: wvs_sample2]

If the oddity isn’t jumping out at you yet, let’s use something a little more visually friendly to compare the 1996 and 2002 numbers. WVS also makes available an Online Data Analysis toolkit at http://www.wvsevsdb.com/wvs/WVSAnalizeQuestion.jsp. With a little tinkering, we can get to this screen:

[Image: wvs_sample3]

“Mentioned” (the measure for less tolerance) went from 17.3% to 71.7%, while “Not Mentioned” went from 82.7% to 28.3%.

Either:

a) something WTHBBQPWN&%$#* happened in those 6 years that made 54.4% of Bangladeshis do a 180 on their tolerance levels, OR

b) the coding got messed up and “Mentioned” is either 82.7% and 71.7% in 1996 and 2002 respectively, or 17.3% and 28.3% respectively.

For a survey size of N = 1,500ish, b) is always your safer bet when no obvious change agent is involved, endogenous or exogenous.
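For the wonks who want to see why sampling noise alone can’t be the culprit here, below is a rough sketch in Python (not Stata). It assumes exactly 1,500 respondents per wave and simple random sampling, neither of which is confirmed by the WVS documentation quoted here, and the function names are made up for illustration. It only rules out random noise; it does not by itself tell a) apart from b).

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion p with n respondents."""
    return z * math.sqrt(p * (1 - p) / n)


def two_proportion_z(p1, n1, p2, n2):
    """z statistic for the difference between two independent proportions (pooled SE)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p2 - p1) / se


N = 1500        # assumption: "1,500ish" respondents in each wave
P_1996 = 0.173  # share coded "Mentioned" in the 1996 survey
P_2002 = 0.717  # share coded "Mentioned" in the 2002 survey

swing = P_2002 - P_1996
moe_1996 = margin_of_error(P_1996, N)
moe_2002 = margin_of_error(P_2002, N)
z = two_proportion_z(P_1996, N, P_2002, N)

print(f"swing between waves: {swing * 100:.1f} percentage points")
print(f"margin of error, 1996: +/- {moe_1996 * 100:.1f} points")
print(f"margin of error, 2002: +/- {moe_2002 * 100:.1f} points")
print(f"z statistic for the difference: {z:.1f}")
```

Under those assumptions each estimate is good to within about 2 percentage points, so a swing of 54.4 points is far outside anything noise could produce. Something systematic happened: either a genuine sea change or a coding flip.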

(If you’re Bangladeshi, you’re also probably laughing your behind off at a) since the 54.4% number involves tens of millions of people in a generally syncretistic society where appreciation for that heritage has arguably only increased with the younger generations, but that’s “anecdotal” for the purposes of this piece, so we’ll leave that thought there.)

If your data analysis foo is up to the task, I’d encourage you to check out the raw data itself to confirm that this is also the case there. The dataset is generously available at: http://www.wvsevsdb.com/wvs/WVSData.jsp

I used the Stata file for the “WVS FIVE WAVE AGGREGATED FILE 1981-2005” dataset; you can also choose SPSS or SAS formats if they suit your toolkit better. For Stata, the relevant command is:

tab S002 A124_02 if S003 == 50

Giving the result:

                    | Neighbours: People of
                    |   a different race
               Wave | Not menti  Mentioned |     Total
--------------------+----------------------+----------
          1994-1999 |     1,261        264 |     1,525 
          1999-2004 |       425      1,075 |     1,500 
--------------------+----------------------+----------
              Total |     1,686      1,339 |     3,025

 

(S002 is the Wave, A124_02 is the question under study, and S003 contains the country, with Bangladesh being code 50.)
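If you don’t have Stata handy, the counts in the table above are enough to check the percentages by hand. Here is a small Python sketch that does the row arithmetic; the counts are copied from the table, while the dictionary layout and the function name are just my own illustration.

```python
# Counts copied from the Stata cross-tabulation for Bangladesh (S003 == 50).
counts = {
    "1994-1999": {"Not mentioned": 1261, "Mentioned": 264},
    "1999-2004": {"Not mentioned": 425, "Mentioned": 1075},
}


def row_percentages(row):
    """Convert one wave's raw counts into percentages of that wave's total."""
    total = sum(row.values())
    return {label: round(100 * n / total, 1) for label, n in row.items()}


percentages = {wave: row_percentages(row) for wave, row in counts.items()}

for wave, row in percentages.items():
    print(wave, row)
```

The raw file gives 17.3% “Mentioned” for 1994-1999 and 71.7% for 1999-2004, matching the online tool, so the oddity is in the dataset itself and not in the website’s rendering of it.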

What Should The Data Say?

(Spoiler: Bangladeshis are a tolerant bunch – it’s ok to come visit.)

Ok, now that we are reasonably certain the data is confused, which way will it point once we de-confuse it? For this, we need to turn to the actual questionnaires. These too are available at http://www.wvsevsdb.com/wvs/WVSDocumentation.jsp?Idioma=I.

First, the 1996 survey. The relevant section is reproduced below (Bangladesh_WVS_1996_1.pdf, pg 10):

[Image: wvs_sample4]

The top line is the question, basically asking who the respondent “does not want or would not like to have as a neighbor”.

The first column, উল্লেখ করেছেন, means “Mentioned” – to be selected if the respondent notes a particular group of individuals (V51 – V60) are unwelcome neighbors.

The second column, উল্লেখ করেন নি, means “Not Mentioned” – this is for the more chillaxed bunch.

 

 

Now, let’s look at the 2002 survey. The relevant section is reproduced below (Bangladesh_WVS_2002_1.pdf, pg 19):

[Image: wvs_sample5]

The top line has the same question as above, verbatim. The first and second column headers are not the same, though.

The first column, প্রতিবেশী হিসেবে পছন্দ করবো, means “Would like [X] as a neighbor” – to be selected if the respondent notes a particular group of individuals (V68 – V77) are welcome neighbors.

The second column, প্রতিবেশী হিসেবে পছন্দ করবো না, means “Would not like [X] as a neighbor” – this is for the less chillaxed bunch.

 

Yeah… oops!

‘1’ and ‘2’ stand for totally opposite things in the two surveys.

Unless we are willing to allow that the data input folks consciously recoded each ‘2’ on the 2002 questionnaire as a ‘1’, to line up প্রতিবেশী হিসেবে পছন্দ করবো না with the 1996 “Mentioned”/উল্লেখ করেছেন, I think it’s reasonable to assume that ‘1’ was simply coded as “Mentioned”/উল্লেখ করেছেন in the 2002 dataset as well, leading to a flip in the results for that question for that wave. Spot checks also suggest that this reading is consistent with surveys from other countries.
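To make the de-confusing step concrete, here is a minimal Python sketch of the correction, assuming the flip affects only the 1999-2004 wave and only this one question. The counts are the ones from the raw dataset; the `deconfuse` helper and the label-swap table are hypothetical names of mine, not anything WVS provides.

```python
# Counts as coded in the WVS aggregated file for Bangladesh.
counts_as_coded = {
    "1994-1999": {"Not mentioned": 1261, "Mentioned": 264},
    "1999-2004": {"Not mentioned": 425, "Mentioned": 1075},
}

# In a flipped wave, each label actually means its opposite.
SWAP = {"Mentioned": "Not mentioned", "Not mentioned": "Mentioned"}


def deconfuse(counts, flipped_waves):
    """Return a copy of counts with the two labels swapped for the flipped waves."""
    corrected = {}
    for wave, row in counts.items():
        if wave in flipped_waves:
            corrected[wave] = {SWAP[label]: n for label, n in row.items()}
        else:
            corrected[wave] = dict(row)
    return corrected


def share_mentioned(row):
    """Percentage of respondents who named the group as unwelcome neighbors."""
    return round(100 * row["Mentioned"] / sum(row.values()), 1)


# Assumption: only the 2002 questionnaire (wave 1999-2004) had its columns reversed.
corrected = deconfuse(counts_as_coded, flipped_waves={"1999-2004"})

for wave, row in corrected.items():
    print(wave, f"{share_mentioned(row)}% would not want neighbors of a different race")
```

The 1994-1999 row is left untouched, since its questionnaire columns match the “Mentioned”/“Not mentioned” coding.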

With everything righted the right way, here then is what the final numbers look like:

[Image: wvs_data_endresult]

Yes, 28.3%, not 71.7%.

Caveats

  1. I only looked at the question that was used for the tolerance/intolerance WaPo piece. This inconsistency shouldn’t be extrapolated to any part of the rest of the Bangladesh survey unless one has double checked for that.
  2. I ran this for the Bangladesh dataset only. No idea if any of this applies to any other country dataset – same caution as above applies.

Posted by Ashirul Amin

Comments (13) Trackbacks (13)
  1. One good piece of work Ashirul bhai. I must thank you and your wisdom.

  2. Excellent work. Thanks for your analysis which clearly proves that the result was a misrepresentation and the survey itself is error ridden.

    Hails!

  3. Well done bro

    Messing around with Bangladeshis’ data won’t fly ;)

  4. Hi Mr. Amin,
    Great work. I am trying to do a brief piece for The Daily Star on the WashPo piece and follow-up analyses such as this. I plan to quote from here and duly attribute to you. I hope you are okay with this. You may email me adnanramin@gmail.com.

    Thanks.

  5. If I know my Bangladesh right, the result published is nothing but ridiculous. Bangladeshis will discriminate against Bangladeshis when it comes to housing or having one as a neighbor; Bangladeshis would love to have a neighbor [ X ] rather than another Bangladeshi.

  6. The article has done a lot of damage to the reputation of Bangladesh and Bangladeshis. The Washington Post editor should publish a half-page apology for publishing it.

  7. Good article. Though it’s a bit complicated, it brings forward what’s important. Thanks for sharing!

  8. You got the data wrong for Hong Kong
    The translator made it inverted

    http://badcanto.wordpress.com/2013/05/19/hong-kong-is-not-the-most-racist-region-in-the-world/

  9. Thank you friend for finding the fat finger!

    We hope that similarly someone will find that India is not after all the most intolerant of all countries.

    And I dare say that Pakistan appears to me less tolerant than the study finds it to be.

  10. Good job!

  11. Big thanks for discovering the fact

  12. Thanks for the precious analysis, though a bit complicated. May you simplify this and prepare a summary for the people and the media? If you cooperate I’d like to circulate to the newspapers and TV channels in Bangladesh.

  13. This report is a sham and I really doubt the authenticity of the statistics. I am a Bangladeshi and I do not think even 25% people will not like to see their neighbors a different race as foreigners are treated with much hospitality. So it suspect-surveys really gone BAD. This is a pragmatic feedback and not a criticism.
    Cheers, Al