Andrew Gelman alerted us to a classic probability error that showed up on the Freakonomics blog recently (by way of Xian).
Probabilistic thinking is not very intuitive. One wishes the Freakonomics team would attack conventional wisdom, instead of embracing it. Here, they trumpeted the "very long odds in the Israeli Lottery": the fact that the same six numbers won the same lottery three weeks apart. This feels like an extremely rare event but it is a case where we tend to severely overestimate its rarity.
My readers may want a refresher on what the mistake is.
***
Let's start with the much simpler "Birthday Problem". Say you're in the jury selection room with a hundred people. What's the chance that you can find two people born on the same day? The chance is much higher than you would imagine! In fact, the math shows that it is more or less 100%! Even if you have 60 people in the room, it is almost certain. With only 30 people in the room, there's about a 70% chance of finding at least two with the same birthday.
The key to understanding this puzzle is to realize that we did not fix the day of birth; we are not looking for two people born on January 1st. The two people born on the same day could have been born on any one of 365 days (excluding Feb 29 for convenience). The chance that we can find two people born on exactly Jan 1 is much smaller than the chance of finding two people born on the same day.
Similarly, there are 4,950 distinct pairs of people in the room, any one of which can be the pair of people with the same birthday. So, there are many ways in which we can find two with the same birthday and that's why it is much more likely than we think it is.
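For readers who want to check the arithmetic, here is a minimal Python sketch. It assumes 365 equally likely birthdays and independent people, as above; the function names are my own, made up for illustration. It compares the chance of a shared birthday on any day with the chance of finding two people born on one named day, and counts the pairs in the room.

```python
from math import comb


def p_shared_birthday(n, days=365):
    """Chance that at least two of n people share a birthday (any day)."""
    p_all_distinct = 1.0
    for i in range(n):
        # Person i+1 must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


def p_two_on_fixed_day(n, days=365):
    """Chance that at least two of n people were born on one named day (say Jan 1)."""
    p_none = ((days - 1) / days) ** n
    p_exactly_one = n * (1 / days) * ((days - 1) / days) ** (n - 1)
    return 1.0 - p_none - p_exactly_one


for n in (30, 60, 100):
    print(
        f"{n} people, {comb(n, 2)} pairs: "
        f"any day {p_shared_birthday(n):.4f}, "
        f"fixed day {p_two_on_fixed_day(n):.4f}"
    )
```

The gap between the two columns is the whole story: leaving the day unspecified multiplies the number of ways the coincidence can happen.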
The point being: If you take enough drives in the safari, you will find a leopard.
***
Turning to the Israeli lottery. The "rare" event in question can also happen in many ways: the numbers could reappear one, two, three, ..., ten weeks apart (assuming it's a weekly lottery); and the reappearance could happen in any one of I'd think thousands of lotteries all over the world. In fact, every single time a lottery is drawn anywhere in the world, there is a chance that the outcome will prompt some journalist to write up the same story: in X lottery, the same Y numbers won Z weeks apart, what long odds!
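To put rough numbers on this, here is a back-of-the-envelope sketch in Python. Everything specific in it is an assumption on my part, not a reported fact: a hypothetical 6-of-37 draw format, a ten-draw look-back window, 1,000 weekly lotteries worldwide, and ten years of draws. It also treats each draw's chance of repeating a recent result as independent and equal to window / combinations, which is only an approximation.

```python
from math import comb


def p_repeat_somewhere(n_combos, window, total_draws):
    """Approximate chance that at least one of total_draws draws repeats
    one of the `window` results drawn just before it.

    Simplification: each draw has chance window / n_combos of repeating a
    recent result, and draws are treated as independent.
    """
    per_draw = window / n_combos
    return 1.0 - (1.0 - per_draw) ** total_draws


# ASSUMPTION: a 6-of-37 format, chosen only for illustration.
N_COMBOS = comb(37, 6)

# Narrow question: one named draw matches the draw exactly three weeks earlier.
p_narrow = p_repeat_somewhere(N_COMBOS, window=1, total_draws=1)

# Broad question: any repeat within ten draws, in any lottery, over ten years.
# ASSUMPTIONS: 1,000 weekly lotteries, 52 draws a year, 10 years.
total_draws = 1000 * 52 * 10
p_broad = p_repeat_somewhere(N_COMBOS, window=10, total_draws=total_draws)

print(f"combinations: {N_COMBOS}")
print(f"narrow: {p_narrow:.2e}")
print(f"broad:  {p_broad:.3f}")
```

Change the assumed counts and the broad figure moves around, but the direction does not: the more lotteries and weeks you allow, the closer it gets to a sure thing.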
The point being: The more drives you take in the safari, the more likely you will eventually find a leopard.
***
Perhaps the difficult part of accepting this argument is: why should we be thinking about these imaginary scenarios if we already know that it happened in the Israeli lottery and it happened three weeks apart? Doesn't the reality make the imaginary scenarios irrelevant?
Here is how I'd think about this: if you already know that the Israeli lottery drew the same numbers three weeks apart, then the question of what are the odds is silly because the odds are 100% since we know for sure that it happened!
The question of "what are the odds?" makes sense only at the moment when we don't know if something would happen or not. That means one of many possible scenarios could happen.
This need to place ourselves in a state of uncertainty is what I think trips up a lot of people. We do not like uncertainty.
***
For the same reason, some people (most mainstream journalists, it seems) buy the government line that we "made money" or "did not lose money" from the TARP rescue of failed too-big-to-fail banks.
This analysis is done from a state of certainty, from hindsight, taking into account that the banks have thus far survived the danger. But evaluating TARP from this view is like celebrating a lottery win after you found out you are the winner. The fact that you won the lottery does not change the fact that economically, it was silly to play the lottery in the first place.
When a decision is made in a state of uncertainty, it has to be evaluated in that state of uncertainty. For it is the uncertainty that imposes the greatest cost. In the case of TARP, the key question is to determine whether the government negotiated the right terms on behalf of the people at the time of the decision, when the banks faced enormous existential uncertainty. There is little question that the government did not get a good deal. Just take a look at the kinds of deals that the banks themselves get when they save failing companies from bankruptcy! (This often involves the banks owning a vast majority of the shares of said companies in addition to multifarious onerous conditions.) Or, how banks act when we don't pay our credit card or loan bills: is it okay for us to stop paying for a year, repay the amount in full, and declare no harm done?
Of course, there were a zillion other policies not called TARP that directly benefited banks at a cost to the people, and thus a variety of arguments why the bank rescue was not costless. My favorite is the suspension of mark-to-market accounting so banks can list their assets at fictitious prices (some of these assets are cash flows from "liar loans" and so on). I will outsource the big picture to economist Dean Baker who does a great job "beating the press" every day (here, and here).
I'm going to push back on this post a bit (the lottery part) which might ruffle some feathers, so I'd ask that you take a moment to think about this from a different perspective.
First, I have no issues with the probability calculations involved. Namely, if you define the event and sample space as Kaiser does, I am perfectly comfortable with the result, namely that the event isn't terribly improbable.
My problem is with the notion that there is a very clear sense in which there is a "right" or "wrong" way to interpret this lottery situation. For instance, what would you say to a purely hypothetical person who read this post and responded thusly:
"Fine! I agree with your reasoning, that if you include all lotteries over long periods of time, repeated numbers like this is actually quite probable. However, that isn't how I was thinking about this situation. I live in Israel, and I only follow this particular lottery. So from my point of view, what happens in other lotteries isn't terribly interesting or important. Also, I happened to begin playing this particular lottery at the beginning of this particular three week period. So I only really paid attention to this particular three week span. So from my perspective, I thought it was quite unusual that in the only three weeks that I have ever played the lottery, the same numbers appear more than once. Why is your perspective 'right' and mine 'wrong', given that I understand and am ok with both?"
I hope the point of that example was clear...
It seems to me that answering probability questions has (broadly speaking) two parts. First, we translate our intuitive understanding of the problem into a formal definition of an event/sample space. Second, we count how often our event of interest can happen.
My position is that the latter is far more conducive to right/wrong answers than the former. Now, I'm not arguing that every conceivable translation of a problem into formal defs of events/sample spaces is equally justifiable in all cases.
If we were modeling drug response curves to measure the effectiveness of a new treatment, then sure, we'd need to think carefully about whether our translations of any probability problems make sense for the questions we're trying to answer. But what is at stake with the lottery problem? What terrible consequences will befall us if we choose to interpret the event/sample space differently and come to the conclusion that it is unusual? (Provided we are clear about the assumptions we made in the process, and we understand that had we made different assumptions, we would have arrived at a different conclusion!)
"Textbook" problems like these (and the "boys born on Tuesday" problem) are high-stakes only from a pedagogical perspective. They are important for learning how to think clearly about the problem of translating a probability problem into a formal statement of an event/sample space. The important lesson is not that we get the translation "right" but that we learn the skill of thinking clearly about our translation, how it influences our answer, and how it relates to other possible translations.
(And here's where I might ruffle some feathers...)
The traditional attitude adopted here by Kaiser (and Gelman, and others) is, IMHO, a stunning failure from a pedagogical perspective. I call it being a "probability scold". The message is that your intuition is always wrong! It revels in tricking people. (Think of all the classic teaching examples: Monty Hall, Birthday Problem, etc.) Students who are adept at the subject to begin with may respond well to this, but in my experience students who struggle with probability often respond poorly. It makes them feel stupid ("The only reason you thought it was likely was because you don't understand probability theory!") and reinforces the notion that they'll never gain an intuitive understanding of the subject at all, so why bother.
People's natural curiosity about numerical coincidences ought to be used as a hook to encourage their interest in probability, not as a way to lure them into being exposed as wrong or ignorant. ("Let's think carefully about what 'unusual' might mean in this circumstance!" vs. "Aha, see how wrong your intuition was!")
My preferred approach in these problems is not to declare one interpretation "right" and one "wrong", but use it as an opportunity to discuss how our translations of intuitive notions of chance/likelihood/unusualness influence our answers.
To me, the point of this lottery example is not whether it is or is not unusual. I think that depends completely on your point of view and in this particular case it isn't terribly important whether you think that it is or isn't.
What is important is that we can clearly explain the assumptions we made in translating the problem into a formal sample space/event, and that we can explain how our answer may change using different assumptions.
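As a toy illustration of how the answer moves with the translation, here is a small Python sketch that enumerates an entire sample space. The setup is invented purely for illustration: a tiny lottery that picks 2 numbers out of 4, watched for three draws, with every history equally likely. The same histories are scored against a narrow event and a broad one.

```python
from fractions import Fraction
from itertools import combinations, product

# Hypothetical toy lottery: choose 2 numbers out of {1, 2, 3, 4}.
tickets = list(combinations(range(1, 5), 2))   # the 6 possible winning pairs
histories = list(product(tickets, repeat=3))   # all 216 three-draw histories


def probability(event):
    """Exact probability of an event, by counting equally likely histories."""
    hits = sum(1 for history in histories if event(history))
    return Fraction(hits, len(histories))


# Narrow translation: the third draw repeats the first draw.
narrow = probability(lambda h: h[2] == h[0])

# Broad translation: any two of the three draws are the same.
broad = probability(lambda h: len(set(h)) < 3)

print("narrow:", narrow)   # 1/6
print("broad: ", broad)    # 4/9
```

The counting step is mechanical in both cases; the only thing that differs is which event we chose to count.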
I want to close by re-emphasizing that I have no particular problem with Kaiser's interpretation of the lottery example such that it leads him to conclude that the event is reasonably probable. That's fine! I don't even think the question of whether this event is unusual is all that important. What is important is understanding the process of translating a vaguely worded news piece into a formal probability problem, how different people might approach that process differently and how these differences might influence our answers.
So whenever I see prob/stats experts howling about these "mistakes", I just want to say: Stop being such a probability scold! ;)
Posted by: jme | 11/04/2010 at 12:40 PM
I remember when mark to market accounting was blamed for the Enron collapse. Never did I suspect that the government would allow the infamous hypothetical future value accounting standard to be used as an alternative!
Posted by: Cody L. Custis | 11/04/2010 at 01:59 PM
jme: I actually agree with much of what you say about how we teach probability (and statistics) and the idiocy of teaching through puzzles and brainteasers. I personally detest brainteasers, such as the "boys born on Tuesday" which is arranged to illustrate an anomaly.
That said, I have two issues with your argument:
1) I agree that we should not dismiss alternative approaches out of hand but not everything deserves consideration; I want something that is well thought out, with supporting theory, and can be applied generally.
2) How rare the coincidence is is an interesting question since people's response to such events is frequently "how unlikely!" By contrast, I find it hard to believe that anyone would be interested in knowing "what is the chance that this week's Israeli lottery draw will turn up the same set of numbers as the draw exactly three weeks ago?"
Posted by: Kaiser | 11/04/2010 at 11:57 PM