
I'm going to push back on this post a bit (the lottery part) which might ruffle some feathers, so I'd ask that you take a moment to think about this from a different perspective.

First, I have no issues with the probability calculations involved. If you define the event and sample space as Kaiser does, I am perfectly comfortable with the result, namely that the event isn't terribly improbable.

My problem is with the notion that there is a very clear sense in which there is a "right" or "wrong" way to interpret this lottery situation. For instance, what would you say to a purely hypothetical person who read this post and responded thusly:

"Fine! I agree with your reasoning, that if you include all lotteries over long periods of time, repeated numbers like this is actually quite probable. However, that isn't how I was thinking about this situation. I live in Israel, and I only follow this particular lottery. So from my point of view, what happens in other lotteries isn't terribly interesting or important. Also, I happened to begin playing this particular lottery at the beginning of this particular three week period. So I only really paid attention to this particular three week span. So from my perspective, I thought it was quite unusual that in the only three weeks that I have ever played the lottery, the same numbers appear more than once. Why is your perspective 'right' and mine 'wrong', given that I understand and am ok with both?"

I hope the point of that example was clear...
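To put rough numbers on the two perspectives, here is a minimal sketch. Everything specific in it is my own assumption for illustration, not something taken from the post: a 6-from-37 draw format, two draws a week for the three-week player, and a "long history" of 5,000 draws for the wide view. The function names are hypothetical too. It only shows that the same counting machinery gives very different answers once the sample space changes.

```python
from math import comb


def p_any_repeat(pool: int, picks: int, n_draws: int) -> float:
    """Chance that at least two of n_draws independent draws show the
    same set of numbers (a birthday-problem style calculation)."""
    outcomes = comb(pool, picks)  # number of equally likely number sets
    p_all_distinct = 1.0
    for i in range(n_draws):
        # the (i+1)-th draw must avoid the i sets already seen
        p_all_distinct *= (outcomes - i) / outcomes
    return 1.0 - p_all_distinct


# ASSUMED draw format: 6 numbers out of 37 (illustrative only).
POOL, PICKS = 37, 6

# Narrow view: one player, three weeks, ASSUMING two draws per week.
narrow = p_any_repeat(POOL, PICKS, n_draws=6)

# Wide view: a long history of draws; 5,000 is an ASSUMED round number.
wide = p_any_repeat(POOL, PICKS, n_draws=5000)

print(f"narrow view: {narrow:.2e}")
print(f"wide view:   {wide:.2%}")
```

Under these assumptions the narrow view gives a tiny probability and the wide view gives a probability close to one. Neither calculation is wrong; they answer different questions.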

It seems to me that answering probability questions has (broadly speaking) two parts. First, we translate our intuitive understanding of the problem into a formal definition of an event/sample space. Second, we count how often our event of interest can happen.

My position is that the latter is far more conducive to right/wrong answers than the former. Now, I'm not arguing that every conceivable translation of a problem into formal defs of events/sample spaces is equally justifiable in all cases.

If we were modeling drug response curves to measure the effectiveness of a new treatment, then sure, we'd need to think carefully about whether our translations of any probability problems make sense for the questions we're trying to answer. But what is at stake with the lottery problem? What terrible consequences will befall us if we choose to interpret the event/sample space differently and come to the conclusion that it is unusual? (Provided we are clear about the assumptions we made in the process, and we understand that had we made different assumptions, we would have arrived at a different conclusion!)

"Textbook" problems like these (and the "boys born on Tuesday" problem) are high-stakes only from a pedagogical perspective. They are important for learning how to think clearly about the problem of translating a probability problem into a formal statement of an event/sample space. The important lesson is not that we get the translation "right" but that we learn the skill of thinking clearly about our translation, how it influences our answer, and how it relates to other possible translations.

(And here's where I might ruffle some feathers...)

The traditional attitude adopted here by Kaiser (and Gelman, and others) is, IMHO, a stunning failure from a pedagogical perspective. I call it being a "probability scold". The message is that, in probability, your intuition is always wrong! It revels in tricking people. (Think of all the classic teaching examples: Monty Hall, Birthday Problem, etc.) Students who are adept at the subject to begin with may respond well to this, but in my experience students who struggle with probability often respond poorly. It makes them feel stupid ("The only reason you thought it was likely was because you don't understand probability theory!") and reinforces the notion that they'll never gain an intuitive understanding of the subject at all, so why bother.

People's natural curiosity about numerical coincidences ought to be used as a hook to encourage their interest in probability, not as a way to lure them into being exposed as wrong or ignorant. ("Let's think carefully about what 'unusual' might mean in this circumstance!" vs. "Aha, see how wrong your intuition was!")

My preferred approach in these problems is not to declare one interpretation "right" and one "wrong", but to use the problem as an opportunity to discuss how our translations of intuitive notions of chance/likelihood/unusualness influence our answers.

To me, the point of this lottery example is not whether it is or is not unusual. I think that depends completely on your point of view and in this particular case it isn't terribly important whether you think that it is or isn't.

What is important is that we can clearly explain the assumptions we made in translating the problem into a formal sample space/event, and that we can explain how our answer may change using different assumptions.

I want to close again by re-emphasizing that I have no particular problem with Kaiser's interpretation of the lottery example such that it leads him to conclude that the event is reasonably probable. That's fine! I don't think the question of whether this event is unusual is even all that important. What is important is understanding the process of translating a vaguely worded news piece into a formal probability problem, how different people might approach that process differently and how these differences might influence our answers.

So whenever I see prob/stats experts howling about these "mistakes", I just want to say: Stop being such a probability scold! ;)

I remember when mark to market accounting was blamed for the Enron collapse. Never did I suspect that the government would allow the infamous hypothetical future value accounting standard to be used as an alternative!

jme: I actually agree with much of what you say about how we teach probability (and statistics) and the idiocy of teaching through puzzles and brainteasers. I personally detest brainteasers, such as the "boys born on Tuesday" problem, which is arranged to illustrate an anomaly.

That said, I have two issues with your argument:
1) I agree that we should not dismiss alternative approaches out of hand but not everything deserves consideration; I want something that is well thought out, with supporting theory, and can be applied generally.
2) How rare the coincidence is makes for an interesting question, since people's response to such events is frequently "how unlikely!" By contrast, I find it hard to believe that anyone would be interested in knowing "what is the chance that this week's Israeli lottery draw will turn up the same set of numbers as the draw exactly three weeks ago?"

Kaiser Fung. Business analytics and data visualization expert. Author and Speaker.