… or, A Story of How a Coding
Screw-Up Made Bangladesh One of the Least Tolerant Countries in the World. (Spoiler: It isn’t!)
What We Thought We Knew
Yesterday, the Washington Post put out a story, A fascinating map of the world’s most and least racially tolerant countries. In that map, India and Bangladesh stuck out like a baboon’s butt as bastions of intolerance in the world. It’s reproduced below; red implies that more people said they “would not want neighbors of a different race”.
In fact, the percentages are more than a little damning:
“India, Jordan, Bangladesh and Hong Kong by far the least tolerant. In only three of 81 surveyed countries, more than 40 percent of respondents said they would not want a neighbor of a different race. This included 43.5 percent of Indians, 51.4 percent of Jordanians and an astonishingly high 71.8 percent of Hong Kongers and 71.7 percent of Bangladeshis.”
Three thoughts occurred to me in this order:
- Wow that’s an odd basket of countries to be lumped together as the most intolerant!
- Ouch! Yeah I’m Bangladeshi.. and while I’ll be the first to admit we have our own favorite national stereotypes and periods of ethnographically inspired excitement, least tolerant? Really?
- I wonder if someone fat-fingered this big time.
Thanks to the fact that both Max Fisher of WaPo and World Values Survey folks freely shared their sources, we can take a dive into the data that generated the map to explore thought #3 to our heart’s content.
The short answer is, yes, someone did fat-finger this big time. “Yes” and “No” got swapped in the second round of the survey, which means that 28.3% of Bangladeshis said they wouldn’t want neighbors of a different race – not 71.7%.
26K Facebook likers and 2.5K Tweeters, take note.
Now, the long version for the data wonks amongst you. By the way, this piece is restricted to Bangladesh – time, and the ability to read the primary questionnaires, being the main constraints.
What the WVS Data Really Says
(Spoiler: Data says it’s confused…)
There were 5 waves of data collection:
Bangladesh has data for the third and fourth wave.
First, let’s reproduce the 71.7% number. We can use the interactive query tool WVS has set up on their website at: http://www.wvsevsdb.com/wvs/WVSIntegratedEVSWVSvariables.jsp?Idioma=I
“Mentioned” basically means “Yes” (we’ll explore this in detail in a bit).
What about the 1996 survey? Here’s the same page with 1996 data (table shown only):
If the oddity isn’t jumping out at you yet, let’s use something a little more visually friendly to compare the 1996 and 2002 numbers. WVS also makes available an Online Data Analysis toolkit at http://www.wvsevsdb.com/wvs/WVSAnalizeQuestion.jsp. With a little tinkering, we can get to this screen:
“Mentioned” (the measure for less tolerance) went from 17.3% to 71.7%, while “Not Mentioned” went from 82.7% to 28.3%.
a) something WTHBBQPWN&%$#* happened in those 6 years that made 54.4% of Bangladeshis do a 180 on their tolerance levels, OR
b) the coding got messed up and “Mentioned” is either 82.7% and 71.7% in 1996 and 2002 respectively, or 17.3% and 28.3% respectively.
For a survey size of N = 1,500ish, b) is always your safer bet when no obvious change agent is involved, endogenous or exogenous.
(If you’re Bangladeshi, you’re also probably laughing your behind off at a) since the 54.4% number involves tens of millions of people in a generally syncretistic society where appreciation for that heritage has arguably only increased with the younger generations, but that’s “anecdotal” for the purposes of this piece, so we’ll leave that thought there.)
If your data analysis foo is up to the task, I’d encourage you to check out the raw data itself to confirm that this is also the case there. The dataset is generously available at: http://www.wvsevsdb.com/wvs/WVSData.jsp
I used the Stata file for the “WVS FIVE WAVE AGGREGATED FILE 1981-2005” dataset; you can also choose SPSS or SAS formats if they suit your toolkit better. For Stata, the relevant command is:
tab S002 A124_02 if S003 == 50
Giving the result:
           | Neighbours: People of a different race
      Wave |  Not mentioned   Mentioned |     Total
-----------+----------------------------+----------
 1994-1999 |          1,261         264 |     1,525
 1999-2004 |            425       1,075 |     1,500
-----------+----------------------------+----------
     Total |          1,686       1,339 |     3,025
(S002 is the Wave, A124_02 is the question under study, and S003 contains the country, with Bangladesh being code 50.)
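For those without Stata, the same crosstab can be sketched in plain Python. The records below are toy stand-ins, not real rows – in practice you’d load the downloaded dataset – but the fields mirror the WVS codes above (S002 = wave, A124_02 = the neighbors question, S003 = country, with Bangladesh coded 50):

```python
from collections import Counter

# Toy stand-ins for rows of the aggregated WVS file; the real data would be
# loaded from the downloaded dataset.
records = [
    ("1994-1999", "Not mentioned", 50),
    ("1994-1999", "Mentioned",     50),
    ("1999-2004", "Mentioned",     50),
    ("1999-2004", "Mentioned",     76),  # another country; filtered out below
]

# Equivalent of `tab S002 A124_02 if S003 == 50`
table = Counter((wave, answer) for wave, answer, country in records if country == 50)
for (wave, answer), n in sorted(table.items()):
    print(wave, answer, n)
```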
What Should The Data Say?
(Spoiler: Bangladeshis are a tolerant bunch – it’s ok to come visit.)
Ok, now that we are reasonably certain the data is confused, which way will it point once we de-confuse it? For this, we need to turn to the actual questionnaires. These too are available at http://www.wvsevsdb.com/wvs/WVSDocumentation.jsp?Idioma=I.
First, the 1996 survey. The relevant section is reproduced below (Bangladesh_WVS_1996_1.pdf, pg 10):
The first column, উল্লেখ করেছেন, means “Mentioned” – to be selected if the respondent notes a particular group of individuals (V51 – V60) are unwelcome neighbors.
The second column, উল্লেখ করেন নি, means “Not Mentioned” – this is for the more chillaxed bunch.
Now, let’s look at the 2002 survey. The relevant section is reproduced below (Bangladesh_WVS_2002_1.pdf, pg 19):
The first column, প্রতিবেশী হিসেবে পছন্দ করবো, means “Would like [X] as a neighbor” – to be selected if the respondent notes a particular group of individuals (V68 – V77) are welcome neighbors.
The second column, প্রতিবেশী হিসেবে পছন্দ করবো না, means “Would not like [X] as a neighbor” – this is for the less chillaxed bunch.
‘1’ and ‘2’ stand for totally opposite things in the two surveys.
Unless we are willing to allow that the data input folks consciously converted a ‘2’ in 2002 to a ‘1’ in 1996 to connect প্রতিবেশী হিসেবে পছন্দ করবো না to উল্লেখ করেছেন, I think it’s reasonable to assume that ‘1’ was also coded for “Mentioned”/উল্লেখ করেছেন in the 2002 dataset, leading to a flip in the results for that question for that wave. Spot checks also suggest that this is what would be consistent with surveys from other countries.
With everything righted the right way, here then is what the final numbers look like:
Yes, 28.3%, not 71.7%.
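The corrected figure falls straight out of the crosstab counts once the 2002 wave’s codes are swapped back – what was tabulated as “Not Mentioned” (425 of 1,500) is really “Mentioned”. A quick sanity-check sketch:

```python
# 1999-2004 wave counts from the crosstab, with labels swapped back:
# the 425 tabulated as "Not mentioned" is really "Mentioned".
mentioned, total = 425, 1500
pct = round(100 * mentioned / total, 1)
print(pct)  # 28.3
```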
- I only looked at the question that was used for the tolerance/intolerance WaPo piece. This inconsistency shouldn’t be extrapolated to any part of the rest of the Bangladesh survey unless one has double checked for that.
- I ran this for the Bangladesh dataset only. No idea if any of this applies to any other country dataset – same caution as above applies.
A little while back, we realized we were nipping at the heels of “big data” territory, arriving at nearly 1 terabyte of data – “we” being the team at Bankable Frontier Associates I work with, through a partnership with the Center for Emerging Markets Enterprises at Fletcher. Not quite a size that would titillate folks who drink Hadoop and sleep in Elastic Clouds, but it was a sobering moment that called for some reflection on the limits of what we could do with all this information, even as we strained a pretty souped-up machine to its limits.
Most of my concerns stem from the fact that big data has the disconcerting property of confessing to something – anything – under sufficient coercion. It’s a variation of the age-old problem of statistical correlation, aptly captured by XKCD.
As the venerable Nassim Taleb points out, “We’re more fooled by noise than ever before, and it’s because … with big data, researchers have brought cherry-picking to an industrial level. … I am not saying here that there is no information in big data. There is plenty of information. The problem — the central issue — is that the needle comes in an increasingly larger haystack.”
When you’re dealing with data on tens of millions of accounts and billions of transactions from financial institutions serving clients in eight countries, that’s a rather massive haystack to get lost in. In situations like this, it is ever so important to have a set of null hypotheses that can be proved/disproved conclusively, thereby keeping us honest, instead of chasing spurious connections.
Which brings us to correlation vs causation. Yes, we have granular transaction data over a course of years for each account holder, meaning we know everything they are doing with that account. We can also have up to twenty characteristics of the client and the account type – age, gender, income, occupation, age of account, interest rate paid, etc.
But unless such studies are paired with detailed financial diaries, we know nothing of the individuals’ motivations for why they do what they do, or of the rest of their financial portfolio and the financial tools at their disposal. This means we usually cannot say things like, “the average account holder saves Ksh X for her child’s school uniform”.
And that’s ok.
Causality in the social sciences is a hard problem. It’s not possible to hold “everything else constant” like we can in the hard sciences. Human free will allows for a mind-boggling array of choices, people may not always take the same decision despite being faced with the same choices, and sometimes an effect may have multiple contributing causes.
Quantitative researchers do the best they can to account for all possible explanatory variables and then attribute degrees of causality to certain variables. Because we don’t have all possible explanatory variables when dealing with big data, we restrict ourselves to demonstrating strong correlations and usually end up indicating potential causal connections and let others take it from there – such as field researchers who can conduct focus groups to dig in deep.
This may not sound intellectually gratifying, but it is once you get into the thick of things. Let’s consider the example of two savings types: A, which is a short-term, low-balance almost transactional behavior, and B, which is accretionary savings over the course of a year leading to a decent balance. A is strongly correlated with ATM card usage, while B is strongly correlated with branch usage. 20-40% of all savings accounts seem to display A-type behavior, while about 1% display B-type behavior across many of the financial institutions we have looked at and I can talk about. What questions come to mind? How about:
- Do ATMs make it hard to save larger amounts over the long term because it’s just so easy to take money out? Do branches make it harder for folks to withdraw funds willy nilly and therefore save more over the long term?
- Or.. do clients self-select to use ATMs in cases where they need easy access to money and intermediate small amounts through that channel, leaving the transactions aimed at building that large lump sum for some purpose to happen at branches, not least because they don’t feel safe hauling a satchel of cash to an ATM in the middle of nowhere?
The implications of potential answer(s) can be profound. The first would imply that while we have celebrated ATMs as a successful de-congestion measure for banks, reducing staff load, client wait-times and operational expenses associated with physical branches, they have also caused people to save less, which can be antithetical to the cause of financial inclusion. On the other hand, the second would imply that branches still have certain benefits that are not being captured by other channels, and more effort needs to be made to address this convenience/security factor.
Of course, as with any complex system, the actual answer probably contains kernels of truth from both possibilities, and then some. Unless ridiculously fortuitous natural experiments present themselves with just the right incentives, say through subtle product rule changes intended to “nudge” a certain type of behavior, it’s well nigh impossible to seek answers to these kinds of questions irrespective of how big “big data” is.
(Btw, having 1% of accounts display a particular type of behavior across different banks in different countries is highly interesting in itself, since there is nothing definitional that would force this to happen. But that’s another story.)
I, for one, sleep peacefully at night knowing that often, all I can expect to get from “big data” are glorified correlations; anything else is gravy.
Saving in a Lending-to-save Product
We know that folks who have to deal with incomes that are low, irregular and uncertain have to resort to adapting available financial instruments to meet their idiosyncratic needs. This is another post on one of my favorite datasets – P9 – that illustrates a simple but powerful adaptation. (You can read the previous post here.)
You’ll recall that P9 is a lending-to-save product, where a certain proportion of the earmarked amount is held back as savings, which is then replaced with cash flow from the client once the loan portion has been paid off. This implies that you have to pay off the loan amount first, before you can really save. If nothing else, the discipline of paying off the loan in small increments is transferred to saving in small amounts towards a large lump sum.
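The flow described above can be sketched in a few lines. This is a hypothetical simplification – the function and parameter names are mine, and real P9 has fees and other details the sketch ignores – but it captures the shape: part of the tranche sits in escrow as savings, payments clear the loan first, and later payments build savings up to the escrowed amount.

```python
def p9_progress(tranche, escrow_share, payments):
    # Hypothetical sketch of the lending-to-save flow; names and the
    # escrow_share parameter are illustrative, not actual P9 terms.
    loan_outstanding = tranche * (1 - escrow_share)   # cash disbursed as a loan
    savings_target = tranche * escrow_share           # amount held in escrow
    saved = 0.0
    for p in payments:
        repay = min(p, loan_outstanding)              # the loan is paid down first...
        loan_outstanding -= repay
        saved = min(saved + (p - repay), savings_target)  # ...then savings build
    return loan_outstanding, saved

# Ten payments of Tk. 100 against a Tk. 1,000 tranche with half held in escrow:
print(p9_progress(1000, 0.5, [100] * 10))  # (0.0, 500.0)
```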
Except, what if you only wanted to save, and didn’t need or want the loan?
It seems that a certain portion of the clients at the Hrishipara site (P9 is offered in two sites – Hrishipara and Kalyanpur) have adapted the product to this end by paying off the loan within the first day of disbursement, presumably using the same amount they had taken out, and then spending the next few weeks or months saving up. Clients thus seem to have taken the conscious decision to do away with the lending half of the “lending-to-save” model but have voluntarily taken on the discipline expected of them as they save up towards the amount held in escrow on their behalf.
Tracking Down the “Only Savers”
The first clue that something was not going exactly according to plan was this plot:
This plot tells us what percentage of the tranche is paid off as the first payment. To fully grasp what this is showing, let’s first set some expectations. Say you decide to pay off an outstanding amount of Tk 1,000 in 10 equal installments of Tk. 100. How much of the tranche are you paying off per payment? Why, 10% of course (Tk. 100/Tk 1,000). What if you decided to pay it off in 20 installments of Tk. 50? Each payment would then constitute 5% (Tk. 50 / Tk. 1,000).
Of course, this can also be calculated by taking the reciprocal of the number of payments as a percentages – 1/10 = 10%, 1/20 = 5%, and so on. We wouldn’t expect the first payment to be anything different per se from the “average” payment, so our expectation of the size of that first payment would also be 10%, 5% or x% depending on whether we expect 10, 20 or n payments, where x = 1 / n as a percentage.
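The arithmetic above is trivial to sketch: the first-payment share is just the first payment divided by the disbursed amount, and for n equal installments it works out to 1/n.

```python
def first_payment_share(disbursement, payments):
    # Fraction of the tranche cleared by the very first payment.
    return payments[0] / disbursement

print(first_payment_share(1000, [100] * 10))  # 0.1  (10 installments -> 10%)
print(first_payment_share(1000, [50] * 20))   # 0.05 (20 installments -> 5%)
```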
Thus, the graph above tells us that in 56% of the tranches, the first payment is 10% or less of the entire disbursement amount – something we would expect. But check that 27% in the blue circle – these folks have paid off around half the disbursement amount through the first payment. And the clients in the green circle – the 5% – have paid off almost all, or all, of the disbursement amount right at the first payment!
What is going on with the folks in the blue and green circles!?
The examples are pretty self-explanatory. The table below is for the blue “Save Only” folks – you can see the almost-equal amounts for the loan and the repayment made, with the delta essentially being a fee of Tk. 10-100.
And the table below is for the green “Ramp Up” folks – you can see that the repayments are equal to the disbursement amount.

Yes, clients are paying off the entire tranche amount. This is generally done because you have to cycle through smaller tranches before you are earmarked a larger tranche, and these guys have simply decided to do that cycling in one go. Most clients will cycle through one or two such tranches, but one particularly adept client went through 7 tranches in 8 days, cycling from Tk. 3,000 to Tk. 13,000.
I have to say, it’s not often that a pattern jumps out like this – if only portfolio analytics was generally this readily discoverable!
Adaptation Behavior Over Time
How consistent is this “savings only” behavior? Do they do the same thing tranche after tranche, or do they go back to taking advantage of the loan option? If you consider the blue circle folks as “Saving Only” and the green circle folks as “Ramp Up” clients, with the remaining as “Neither”, you can envision a 3 x 3 transition matrix between each tranche where a client in any of the three “states” can choose to be at any of the other three “states”.
The complete state transition figures are given below as percentages of the number of accounts that have gone through that tranche. We stop at the 20th tranche because fewer than 50 accounts have gone through more, resulting in a lot of noise.
That’s a lot of numbers.. so let’s just focus on these three rows: “Neither -> Neither”, “Neither -> Save Only” and “Save Only -> Save Only”. The first goes from 74% to 44%, the second fluctuates between 2% and 14%, and the third goes from 8% to 26%. Thus, fewer and fewer clients continue the lending-to-save model, and more and more save only.
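A transition matrix like this can be computed by counting consecutive-tranche state pairs per client. A minimal sketch (the function name and input shape are mine, and it aggregates across all tranches for simplicity, whereas the write-up reports percentages per tranche):

```python
from collections import Counter

def transition_matrix(state_sequences):
    # state_sequences: one list of per-tranche states per client,
    # e.g. ["Neither", "Save Only", "Save Only", ...]
    pair_counts, from_counts = Counter(), Counter()
    for seq in state_sequences:
        for a, b in zip(seq, seq[1:]):    # consecutive tranche pairs
            pair_counts[(a, b)] += 1
            from_counts[a] += 1
    # share of transitions out of each state that land in each target state
    return {pair: n / from_counts[pair[0]] for pair, n in pair_counts.items()}

m = transition_matrix([["Neither", "Neither", "Save Only"],
                       ["Neither", "Save Only", "Save Only"]])
print(m[("Neither", "Save Only")])  # 2 of the 3 "Neither" tranches move to "Save Only"
```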
A closer snapshot of this dynamic is given below by focusing on the two states of “Neither” and “Save Only” and looking at the 2nd, 10th and 20th tranche:
What Does This All Mean?
Well, at the end of the day it’s fairly simple – P9 at Hrishipara has certain rules, and its clients found a way to use them to serve their needs better when they were interested in saving only. Quantifying the phenomenon gives us a sense of how widespread it is, and allows product designers to account for deviations from expected behaviors. (I haven’t looked at the P9 Kalyanpur data yet but my sense is that the product there is more flexible and accommodates this behavior already.)
One subtlety that you’ll probably appreciate: this usage behavior indicates that clients prefer having the option to draw down a loan amount even if they do not exercise that option all the time – in fact, around the 20th tranche, about a tenth of the tranches exercise the option to draw down after saving only in the previous tranche.
The write-up on which this post is based can be found at the P9 Databank. It benefited greatly from Stuart Rutherford’s feedback.
Why, you can save through all of them, of course!
That was a key part of the intuition that gave rise to the three savings types outlined in BFA’s InFocus Note #3: Combining demand and supply side insights to build a better proposition for banks and clients. This post walks through some of the highlights of this Note.
The Need for A New Savings Nomenclature
But, you may ask, why on earth do we need to come up with new types? Well, mostly because we didn’t find anything out there that did justice to the nuances in savings behavior we were seeing, and because we had tons and tons of data and so could segment at the granularity that client-based surveys could not accommodate. Systematic classification of savings types is sparse, and frankly, my favorite is still the oldie-but-goldie from Stuart Rutherford’s The Poor and Their Money. There is “saving up,” “saving down,” and “saving through.” You can read about this here, here and here, but basically the first is classic savings, the second is classic credit, and the third is a mixture of the two (like health insurance). Turns out voluntary savings accounts can display behavior that cannot be satisfactorily classified into one of these three.
We were also looking for pattern-based matches based solely on account and balance information from the MIS, without any clue as to why savers were doing what they were doing. (We went on to combine this with client surveys afterwards, but that’s another story.) The patterns had to be sensible and discernible from each other, but they also had to be very precise to match the precision of the data we had on our hands. And on a personal level, it’s just fun to be able to craft software bots that crawl through the 0s and 1s to provide the kind of insights we gained!
X101: The A, B and C of Savings
Anyway, so coming back to our mattress->cow story… One can save a small amount, or a larger one. One can save it for a short period of time, or longer. And, one can save it in a form that allows ready access to cash, or in one that takes a bit of effort to liquidate. Generally speaking, one tends to store smaller amounts of money for a shorter period of time in a more liquid form at one end of the spectrum, and larger amounts of money for longer periods of time in rather illiquid forms.
Combining this intuition with our mattress-savings club-cow triptych gives us:
As self-explanatory as this graph is to you and me, it means absolute jack to Python, our programming language of choice. We needed a way to translate what you are seeing above into numerically defined filters that classified accounts based on one or more indicators.
We settled on the following rules for our pet algorithms through a process that relied largely on descriptive analytics of the underlying dataset and Daryl Collins‘ extensive experience with the financial lives of the poor – a process that was really part science, and part art.
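To give a feel for the shape of such filters, here is a toy classifier. Every threshold and indicator choice below is invented for illustration – the actual X101 rules were derived from the data and Daryl Collins’ expert judgment, and live in InFocus Note #3:

```python
def classify_savings_type(avg_balance, deposits_per_year, tenure_months):
    """Toy illustration of the shape of the X101 filters.

    All thresholds and indicator choices are made up for illustration;
    they are NOT the actual X101 rules.
    """
    if avg_balance < 25 and tenure_months <= 3:
        return "A"   # small, short-lived, near-transactional balances
    if deposits_per_year >= 10 and tenure_months >= 12:
        return "B"   # steady accretion over a year, savings-club style
    if avg_balance >= 100 and tenure_months >= 12:
        return "C"   # a large amount parked for a long time, cow style
    return None      # falls into one of the leftover buckets

print(classify_savings_type(10, 4, 2))    # A
print(classify_savings_type(60, 24, 18))  # B
print(classify_savings_type(500, 2, 24))  # C
```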
Note that while clients may display all three types of behaviors, not all are welcomed by banks. Type A accounts are particularly expensive to maintain, since they do not maintain adequate balances for the bank to book sufficient income on the float of that balance.
Not all accounts would fall into one of the three types. The two below captured the leftovers with some level of activity. Those which showed no activity are simply marked dormant.
The “Active but not Savings” bucket contains accounts that display “dump and pull” behavior, where individuals use the account as a temporary repository between cash inflows and outflows; this is typical of salary deposits or social grants.
We call this entire nomenclature “X101”. The genesis of this name involves thinking of this exercise as an X-ray that provides a basic-level dissection of savings accounts.
The X101 Wagon Wheel
Once we apply this nomenclature to the underlying savings accounts, we get breakdowns that are specific to each of the financial institutions we looked at. One example is given below; it’ll give us a sense of the kind of information we can get from something even this aggregated. (Source: InFocus Note #3, page 10)
- A full half of the accounts are dormant! (Yes, it’s amusing how the number is exactly 50%..) Uptake followed by non-usage is a nagging problem for many of these institutions.
- About half of the accounts that are not dormant display A-, B- or C-type behavior. Seems like only a quarter of the accounts this institution services are really saving.
- B-type saving is hard to do! Recall that this is the one analogous to the savings clubs, which requires considerable discipline. But voluntary savings accounts do not have discipline enforcement mechanisms by definition, and few have incentives either.
- The rest are about evenly split between the “dump and pull”-ers and the folks who can maintain some kind of balance some of the time, but not all the time.
Is this what you would have expected, based on what you know about savings accounts?
Looking through the X101 Lens
Now that we have this classification of the accounts, we can look at existing information through a new lens, so to speak. Two examples are given below.
The first involves asking how much it costs to support each of these types of accounts. Below are the net revenue numbers in USD for one of the banks:
So.. other than Type B, all other types are losing money for the savings division.. Not so good from a financial sustainability point of view, especially considering Type Bs typically make up a small sliver of total savers. (These figures include the amortized customer acquisition costs and monthly maintenance charges, by the way.) This sort of analysis is the beginning of the discussion surrounding the business case of savings accounts, and how things can be different.
The second involves this thing called “channel dominance” – a creation of the venerable David Porteous. Financial institutions offer their services through different channels, such as branches, ATMs, agents, mobile vans, mobile phones etc. We consider an account to be displaying a certain channel dominance if the number of transactions the client conducts using that channel exceed those conducted through any other channel by at least 50%.
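The dominance rule translates directly into code. A sketch (the function name and input shape are mine) that treats an account as channel-dominant only when its top channel has at least 50% more transactions than the runner-up:

```python
def dominant_channel(txn_counts):
    # txn_counts: mapping of channel -> number of transactions for one account.
    ranked = sorted(txn_counts.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) == 1:
        return ranked[0][0]
    (top, top_n), (_, second_n) = ranked[0], ranked[1]
    # dominant only if the top channel exceeds the runner-up by at least 50%
    if top_n > 0 and top_n >= 1.5 * second_n:
        return top
    return "Other"

print(dominant_channel({"ATM": 30, "branch": 10, "agent": 2}))  # ATM
print(dominant_channel({"ATM": 12, "branch": 10}))              # Other
```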
For one of the banks, the breakdown of channel dominance by X101 types looked like so (“Other” implies that the account did not fall in any one of the dominance buckets):
So .. we see that:
- Type A savers love ATMs! Easiest to withdraw cash, maybe?
- Type B savers really love branches! Could going to branches be providing some of the discipline needed for this kind of saving?
- Type C savers don’t really have a particular preference between ATMs or branches, but they sure don’t like agents… Maybe access to agents makes it hard to maintain balances over a long period of time?
- Balance Managers look like Type C savers as far as the channel distribution is concerned.. Perhaps they just need a nudge or three to become Type Cs?
Yes, the purported causal chains I casually drop above are purely speculative. But this line of thinking gave food for some great discussions with the institution in question, who know their clients really, really well.
The Big Picture
I think the X101 nomenclature has the potential to materially impact the conversation around low-income savers and their savings accounts. It’s a rather quantitative approach that focuses on the how, which when married with the qualitative why provides fascinating insights into savings-oriented financial inclusion. This is important because saving is often hard for the client to do, and appropriate savings products are often challenging for the banks to design. X101 can inform this discussion, and we’ve been having some fascinating discussions indeed.
I found some pretty nifty Python code online that allows one to calculate Excel-like XIRR, and used the publicly available P9 data as meat for the grinder. This post shares the goodies that came out through the other end.
P9 is a pretty cool savings-and-loan product managed by Stuart Rutherford and SafeSave. Clients take a certain amount out and commit a significant portion of it to a sort of savings escrow. First, they pay down the loan, and then accumulate up to the amount of savings that is held in that escrow. This mechanism provides immediate access to cash in the short term, and builds up savings in the longer term.
There are a few things that stand out about P9, two of which particularly piqued my interest:
- Clients can take however long they want to pay back the drawn down amount, and they can pay back as often (or not-so-often) as they want, and
- There is no interest rate associated with the draw down, only an up-front charge of 1% or 3%.
So … how long do clients take to pay back? And, how much are they paying for this service in effective interest rates (EIR)? Let’s take a look.
Keeping it short and sweet
P9 has about 800 clients, and they have collectively gone through almost 5,000 cycles. Each of those cycles is counted separately (and not all cycles are counted here – see the fine print below). The overall distribution is like so:
Do you see something interesting here? There are relative peaks around the 30, 60 and 90 day marks. They’re not massive, but they are accentuated by the troughs on either side. There is nothing in the product design that would reinforce a 30-, 60- or 90-day cycle, so there must be some kind of external cash flow event these line up with, unless the clients are self-enforcing this regularity. Possible candidates could be salaries, remittance inflows and other microfinance institution (MFI) disbursements that do enforce periodicity – but I’m just guessing here.
Thus, 2/3 of the clients pay back within 90 days, and virtually all do so within the year.
This is good news, in that not only does P9 preserve its capital, but manages to cycle it multiple times within a year. The range of cycle lengths also suggests that there is demand for flexible-duration loan products – a feature that products offered by MFIs sorely lack.
But.. (yes, there’s always a “But..”) if clients are going through multiple cycles, they are also paying the up-front fee multiple times. And by the laws of compound interest, 2% and 2% tend to add up to more than 4%.
No Surprises with the EIR
How bad could it get? Well, the extreme case is someone going through 1-day cycles of 1%-fee drawdowns. This gives an EIR of 3,500%. You’ve also probably seen pay-day loans carrying EIRs of hundreds of percent. So hypothetically at least, it can get pretty bad.
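A back-of-envelope version of that extreme case: treat the 1% fee on each 1-day cycle as a 1% daily cost and compound it over a year. (The exact figure depends on the annualization convention, which is why this lands in the same few-thousand-percent ballpark rather than exactly on the number above.)

```python
daily_cost = 0.01                   # 1% fee paid on every 1-day cycle
eir = (1 + daily_cost) ** 365 - 1   # compound daily for a year
print(f"{eir:.0%}")                 # several thousand percent
```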
This is what it looks like for P9:
The EIRs for the shortest cycles are pretty high, as expected, and taper off rapidly as cycle lengths get longer. This relationship holds at all percentiles, also as expected:
If you’re worried about the 156% in the 90th percentile, note that this is for “30 days or less” bucket, and involves cycles which are a couple of days long, at most.
There is a certain amount of variability in the repayments, as allowed by design, so the EIRs aren’t exactly what one would expect with a uniform paydown. If more of the payments happen earlier on, the EIR is bumped; if more of the payments happen later on, the EIR is reduced.
Words of Caution
First, this analysis doesn’t take into consideration all the cycles clients have gone through. It ignores the roughly 1,000 cycles that involve top-ups, and another 200 that were discarded for various reasons. This leaves about 3,700 cycles for this analysis. Top-ups were ignored because they require extra-special care when stitching consecutive cycles together, and I’ll tackle them when I have some more time.
Second, while EIRs are very useful for analytical purposes for apples-to-apples comparisons, they tend to lose their utility a bit when very short time frames are involved. By virtue of their compounding nature, they assume that all returns will be reinvested continually too, in addition to principal, which is hardly the case in real life from the client’s point of view. Thus, the 156% we picked on above very, very probably has no connection to anything in reality in that client’s life.
Special thanks to MFTransparency’s Tim Langeman who shared the Python code needed to calculate the EIR using cashflow discounting, just like Excel’s XIRR function, in this post. His work is based off of Skipper Seabold’s post here. It saved me a lot of time being able to re-engineer their work for my needs.
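For reference, the core of an Excel-style XIRR fits in a few lines: find the rate at which the dated cash flows discount to zero. This minimal bisection version is a hypothetical illustration of the idea, not the MFTransparency code referenced above:

```python
from datetime import date

def xnpv(rate, cashflows):
    # cashflows: list of (date, amount), outflows negative; discount to day one
    t0 = cashflows[0][0]
    return sum(cf / (1 + rate) ** ((d - t0).days / 365.0) for d, cf in cashflows)

def xirr(cashflows, lo=-0.99, hi=100.0, tol=1e-8):
    # Bisect on the rate: xnpv decreases monotonically in `rate` when an
    # initial outflow is followed by inflows, so the root is bracketed.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if xnpv(mid, cashflows) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Tk. 1,000 drawn down, Tk. 1,010 repaid 30 days later (roughly a 1% fee)
flows = [(date(2013, 1, 1), -1000), (date(2013, 1, 31), 1010)]
print(f"{xirr(flows):.1%}")  # annualized effective rate, well above 1%
```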