The Vancouver Winter Olympics 2010 is almost over, and no athlete has tested positive thus far. By comparison, at the Beijing Summer Olympics, about a dozen athletes tested positive out of almost 5,000 tests. In Chapter 4 ("Timid Testers / Magic Lassos") of Numbers Rule Your World, I discuss the statistics behind steroid testing, including how to interpret the number of positive cases. This leads to several surprising conclusions, such as why negative tests don't mean much.
I will appear on Ken Broo's sports radio show on 700 WLW on Sunday, 2/28 between 9:30 and 10:30 CST (not the entire show) to chat about drug testing in sports. Tune in if you can!
In looking at the details of Chapter 1, I neglected to discuss its theme. The part of Freakonomics that appeals to me concerns how data is harnessed to answer interesting questions. Beneath the stories, Chapter 1 is primarily concerned with the collection of data, rather than the analysis of data. Indeed, what counts as analysis consists of a few sample averages (e.g. how much the "typical prostitute" earns) and a few subgroup comparisons (e.g. the relative costs of different sex acts).
Turning now to Chapter 2 (the "terrorism" chapter), I find the material here much richer for the statistically-minded reader, and well worth my time.
The chapter has a tri-partite structure: the first section deals with a dazzling assortment of statistical factoids, in a presentation that will either infuriate or engage the statistician, as I will explain below; the second section looks at how ER doctors can be compared even though the assignment of patients is not randomly determined; and the third section describes how one British mystery person uses bank data to find suspected terrorists.
As I indicated, Chapter 2 Part 1 (pp. 57-62) will either infuriate or engage you. A large variety of statistical factoids are examined, from which I list three representatives:
Pregnant Muslim women who took part in Ramadan fasting had babies who grew up to have higher incidence of visual, hearing, or learning disabilities. As a result, certain cohorts (by birthdays) of Muslims have disproportionately higher incidence of such disabilities.
Coaches of youth soccer leagues in Europe pick the oldest children within each age group delimited by a cutoff birthday of December 31, causing the birth month distribution of players to skew towards Jan, Feb and March (as opposed to Oct, Nov and Dec).
The practice of listing co-authors in alphabetical order on economics journal articles means economists with last names starting with "A" have a greater chance of winning the Nobel prize.
If this were a statistics book, the author would use these examples to illustrate the notion of "spurious correlations". Within the Muslim community, there is a correlation between certain birthdays and higher incidence of disabilities. However, this is a spurious correlation because the day of birth does not cause disabilities; what is happening is that those birthdays are correlated with fasting mothers, and fasting causes some babies to grow up with disabilities.
L&D take a different approach; they play up the correlations for effect. They say things like "it is no exaggeration to say that a person's entire life can be greatly influenced by the fluke of his or her birth." (p.58) In the case of soccer leagues, they say "birth timing may push a marginal child over the edge." (p.62).
For this situation, I would stress that birthday is a useful indicator of a child's likelihood to make the league but it is not a cause. The reason why the birth month distribution is skewed is that the kids born in Jan, Feb or March are older and stronger than those born in Oct, Nov or Dec and therefore are more likely to earn the coach's favor.
The discussion of economics Nobelists is still stranger. L&D cite the researchers' conclusion that "one of us is currently contemplating dropping the first letter of her surname", adding that the "offending" name was Yariv. Why would any economist want to change his or her name to begin with "A"? The only reason I know is the belief that having a last name beginning with "A" causes one to have a greater chance to win a Nobel.
It is clear that L&D know the difference between causation and correlation, so I think this is an attempt to make the material interesting. This presentation forces me to delve into what's a cause and what's not; therefore, I find it engaging. Others may find it infuriating.
Other thoughts on Part 1:
p.59 -- If the women who survived the Spanish flu pandemic then suffered "terrible luck" "over their whole lives", are L&D saying it would have been better for them to have died from the flu?
p.61 -- I'm not sure how this sentence escaped Levitt's attention; this is an egregious error:
Most youth [baseball] leagues in the U.S. have a July 31 cutoff date. A U.S.-born boy is roughly 50 percent more likely to make the majors if he is born in August instead of July. Unless you are a big, big believer in astrology, it is hard to argue that someone is 50 percent better at hitting a big-league curveball simply because he is a Leo rather than a Cancer.
Likelihood to make the majors is not the same as likelihood to hit a big-league curveball! Indeed, in such a competitive field, the difference in batting averages between a kid who makes the majors and one who narrowly misses out is likely to be a matter of hundredths or even thousandths. While on average, the August class may have a 50 percent higher likelihood of making the majors, the batting average of the August class is extremely unlikely to be 50 percent higher than that of the July class.
(The last sentence also shows that they realize date of birth is not a cause. That's why I think the presentation style is deliberate.)
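To see why a 50 percent higher chance of making the majors need not mean anything close to 50 percent more ability, here is a minimal sketch in Python. It assumes, purely for illustration, that ability follows a standard normal distribution and that only players above a very high cutoff make the majors. The cutoff of 4 standard deviations and all variable names are hypothetical, not taken from the book.

```python
from statistics import NormalDist

# Assumption: ability is standard normal; the cutoff is hypothetical.
ability = NormalDist(mu=0.0, sigma=1.0)
cutoff = 4.0  # only the extreme right tail makes the majors

# Chance that a July-born player clears the cutoff.
p_july = 1 - ability.cdf(cutoff)

# Suppose the August class is 50 percent more likely to clear it.
p_august = 1.5 * p_july

# How far must the August ability distribution shift (in standard
# deviations) to produce that 50 percent higher tail probability?
shift = cutoff - ability.inv_cdf(1 - p_august)

# Sanity check: a distribution shifted by `shift` reproduces p_august.
p_check = 1 - NormalDist(mu=shift, sigma=1.0).cdf(cutoff)

print(f"P(July makes majors)   = {p_july:.2e}")
print(f"P(August makes majors) = {p_august:.2e}")
print(f"Required shift in mean ability = {shift:.3f} standard deviations")
```

Under these assumptions the required shift is roughly a tenth of a standard deviation: a small edge in average ability translates into a large ratio of tail probabilities when the cutoff is extreme.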
p.62 -- In a reference to the above baseball example, L&D make the side comment that in determining a boy's chance of making the majors, other factors may be "infinitely more important than timing an August delivery date". Are they thinking about the birthday as a cause or a correlation? I can't decide. (Trying to time the delivery would correspond to believing that being born a Leo rather than a Cancer would help, which seems to contradict the bit on p.61.)
p.61-2 -- On p.61, they talk enthusiastically about Anders Ericsson who argues that stars are made, not born. L&D even wrote an article called "A Star is Made". On p.62, they disclose two almighty factors that are much more important than "birth effects" for being able to play in the majors: being born a male, and having a father who played in MLB. But aren't both those factors born, not made?
p.62 -- They end with this assertion: "So if your son doesn't make the majors, you have no one to blame but yourself; you should have practiced harder when you were a kid." I learn a couple of things from this: (1) their readers are men; (2) training harder causes me to have a higher chance of making the majors, which causes my son to have a higher chance of making the majors.
Will write about the rest of Chapter 2 in a future post.
F.D.A., Nissen [a cardiologist] and G.S.K. [GlaxoSmithKline, the pharma] all come to comparable conclusions regarding
increased risk for ischemic events [heart attacks], ranging from 30 percent to 43
If everyone agrees that taking Avandia, the blockbuster G.S.K. diabetes drug, increases the risk of heart attacks, what's the problem? Why is Avandia still being prescribed? Read this New York Times article to find out.
While this article contains very little science, it is not to be missed by anyone interested in real-life decision-making in our society. For those working in industry, it's better to realize sooner rather than later that getting the analysis right is only the beginning of the end. And that is because the kind of mathematical argument statisticians are trained in is not useful in convincing people who are not similarly trained. This article gives us a rare glimpse behind the scenes: we learn about who the players are, and moreover, we eavesdrop on some of their (secretly recorded) conversation.
This article mirrors the perspective that I take in putting together Numbers Rule Your World. There are people in these stories, not all of them are statisticians, and they shape the final decisions. The math gets us to the beginning of the end, but the path to the end is often tortuous, and sometimes, even mathematical optimality has to take a back seat.
Update [3/29/2010]: This post has found a new life over at Andrew Gelman's blog.
I'm glad Andrew Gelman read the book: he's not one to get addicted to pop-statistics books. He noted:
I liked that Kaiser is sending the message that this all makes sense:
rather than trying to portray probability as counterintuitive and
puzzle-like, he's saying that if you think about things in the right
way, they will become clear.
He made an interesting point about methods versus people. And he rightly surmised that I focused on the people because it's hard to make methods engaging to the reader, unless the reader wants to read about methods. Another reason is that in practice, many methods yield similar results -- statistically different but practically the same, and it's the people and their motivations that ultimately determine the course of action.
Christian Robert also wrote a nice review. Both he and Andrew recognize that my objective is to bring young people into the field of statistics, and to convert stubborn adults who think statistics is boring. Christian notices a difference in style compared to Taleb:
The overall tone of Numbers rule your world is pleasant and engaging, at the other end of the stylistic spectrum from Taleb’s Black Swan. Fung’s point is obviously the opposite of Taleb’s: he is showing the reader how well statistical modelling can explain for apparently paradoxical behaviour.
Len Testa, who creates touring plans for Walt Disney World nuts using statistics and operations research techniques, reviewed the book on his TouringPlans.com blog:
Coincidentally, I’d read SuperFreakonomics last
week. Numbers Rule Your World is written in a similar, easy to read
manner, and focuses on statistics instead of economics. If you liked
Freakonomics you’ll definitely enjoy Numbers.
Many readers will read, or have read, SuperFreakonomics. I'm making my way through the book, and keeping a log of my thoughts. Here is how one statistician takes in Chapter 1 (the "sex" chapter).
p.20 -- was surprised to learn that women used to have shorter life expectancy than men. I have always thought women live longer. This factoid is used to show that throughout history, "women have had it rougher than men" but "women have finally overtaken men in life expectancy". I'm immediately intrigued by when this overtaking occurred. L&D do not give a date so I googled "female longevity": first hit said "it appears that women have out survived men at least since the 1500s, when the first reliable mortality data were kept."; the most recent hit cited CDC data which showed that U.S. females outlived males since 1900, the first year of reporting. In the Notes, L&D cite a 1980 article in the journal Speculum, published by the Medieval Academy. In any case, the cross-over probably occurred prior to any systematic collection of data so I find this minor section less than convincing.
p.20 -- L&D tell us "In China,... females are still far more likely than males to be abandoned after birth, to be illiterate, and to commit suicide." How should one interpret such statistics? My hunch is that among countries with literacy rates similar to China's, it is probably true that females are more likely to be illiterate than males. If so, is the gap in China significantly larger than in other countries? The UN data is easy to find: overall, male, female adult literacy -- China: 91, 95, 87; Singapore: 93, 97, 87; Malaysia: 89, 92, 85; Philippines: 93, 93, 93; Thailand: 93, 95, 91; Mexico: 91, 92, 90; Indonesia: 90, 94, 87, etc. In no way is the inequity in adult literacy in China special. The comment on suicides makes more sense as in most countries, men are more likely to kill themselves but it's the reverse in China.
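The comparison of literacy gaps can be made explicit with a few lines of Python. This is only a sketch: the figures are the ones quoted above, and the dictionary and variable names are my own.

```python
# (overall, male, female) adult literacy rates in percent, as quoted above.
literacy = {
    "China": (91, 95, 87),
    "Singapore": (93, 97, 87),
    "Malaysia": (89, 92, 85),
    "Philippines": (93, 93, 93),
    "Thailand": (93, 95, 91),
    "Mexico": (91, 92, 90),
    "Indonesia": (90, 94, 87),
}

# Male-minus-female gap in percentage points for each country.
gaps = {country: male - female for country, (_, male, female) in literacy.items()}

# List countries from the largest gap to the smallest.
for country, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{country:12s} {gap:2d} points")
```

On these numbers China's gap is 8 points, Singapore's is 10, and Malaysia's and Indonesia's are 7 each, so China sits within the range of its peers rather than apart from it.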
p.21 -- L&D cite "For American women twenty-five and older who hold at least a bachelor's degree and work full-time, the national median income is about $47,000. Similar men, meanwhile, make more than $66,000, a premium of 40 percent." I'm assuming $66,000 is a median income as well. A ratio of two median incomes is not very useful; it tells us nothing about the distributions of the male and female incomes (which are very skewed). A more useful statistic is the percentile of $47,000 in the male income distribution: in other words, the mid-rank female earns less than X% of male counterparts.
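Here is a minimal sketch of the percentile statistic I have in mind. The book gives only the two medians, so the shape of the male income distribution is an assumption: I use a lognormal with median $66,000 and a made-up spread (sigma = 0.6 on the log scale). With real data one would simply count the share of men earning more than $47,000.

```python
import math
from statistics import NormalDist

female_median = 47_000  # from the quoted passage
male_median = 66_000    # from the quoted passage

# The "premium" quoted in the book is the ratio of the two medians.
premium = male_median / female_median - 1

# Assumption: male incomes are lognormal, so log-income is normal with
# mean log(median). The spread sigma is hypothetical, not from the book.
sigma = 0.6
log_male_income = NormalDist(mu=math.log(male_median), sigma=sigma)

# Share of men who earn more than the mid-rank woman.
share_men_above = 1 - log_male_income.cdf(math.log(female_median))

print(f"Median premium: {premium:.0%}")
print(f"Share of men earning more than the median woman: {share_men_above:.0%}")
```

Under these assumed numbers the mid-rank woman earns less than roughly 71 percent of her male counterparts. A different sigma would give a different answer, which is the point: the ratio of medians alone cannot tell us.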
p.21 -- They are chatting about causes of the male-female wage gap. "Even within high-paying occupations like medicine and law, women tend to choose specialties that pay less (general practitioner, for instance, or in-house counsel). And there is likely still a good amount of discrimination. This may range from the overt -- denying a woman a promotion purely because she is not a man -- to the insidious." I wish they made the duality of the cause--effect linkage clearer. The first factor claims women select low-paying jobs while the second factor says high-paying jobs (their hiring managers) select men. This is a common hiccup in causal inference research: which direction does the arrow of causality point?
p.22 -- They make the argument that Title IX boosted the appeal of coaching jobs for women's sports teams. To prove this, they say only 6 out of 13 WNBA teams had female head coaches as of 2009. For some reason, next they tell us ten years ago, only 3 of 14 WNBA teams had female head coaches. Are they saying the prestige of WNBA coaching jobs has declined in appeal over time? I'm confused.
pp.23-4 -- They cite several statistics of the weekly wages of prostitutes in Chicago, in historical dollars as well as in current dollars. First there was a girl who took in $25 a week in old dollars, and $25,000 a year in current dollars. This girl was described as "at the very low end of what Chicago prostitutes earned". So I'm expecting to learn the higher wages others make. The next sentence reads: "a woman working in a 'dollar house' (some brothels charged as little as 50 cents; others charged $5 or $10) took home an average weekly salary of $70, or the modern equivalent of about $76,000 annually." I just couldn't figure out how the words inside the parentheses relate to the rest of the sentence. A "dollar house" doesn't sound like a place where a lot of money is made.
p.23 -- A study estimated that "1 out of every 110 women in that age range [15-44] was a prostitute". This type of statistic is designed to make us think someone in this restaurant (or train, etc.) is a prostitute. But most often, it is misleading. The number is computed by dividing the number of prostitutes by the number of women. It assumes that every woman has the same chance of being a prostitute which is obviously not true. L&D realize this and add: "1 out of every 50 American women [in their twenties] was a prostitute." This doesn't go far enough. Later, on p.32, they inform us that "prostitution is more geographically concentrated than other criminal activity", which means that the chance that a twentysomething is a prostitute is highly dependent on where she lives.
pp.27-8 -- Has a very nice description of why survey research has many limitations, especially when it comes to asking questions about sensitive subjects, like sex, stealing, racism and so on. A cautionary tale for reading polling and market research data.
pp.28-9 -- Pondering how, and why, Venkatesh's method is better. Are former prostitutes more likely to elicit the truth about prostitution than others? If one wants to learn about male chauvinism, would male workers be more likely to get to the truth than female workers? (It's unclear if the former prostitutes were paid; they use the word "hired". The prostitutes being studied were paid.) This highlights the importance of understanding the motivations (and resulting biases) of data collectors. The bias introduced by paying participants is well known in the survey arena but tolerated in order to have an acceptable response rate.
p.29 -- They cite statistics about "the typical prostitute in Chicago." In what ways are the subjects of the study "typical" and in what ways are they not typical? The sample size was 160. They don't say much about the selection process of the subjects, except that they all came from three South Side neighborhoods. Would like to know more about the selection.
p.29 -- "At least 3 of the 160 prostitutes who participated died during the course of the study." Don't use the phrase "at least"! It sounds sloppy, and it is sloppy as "at least 3" includes "everyone". This is a documented study with a small sample; they should know exactly how many died.
p.30 -- After much buildup, we get to their surprise: "Why has the prostitute's wage fallen so far?" I'm looking for the data: what do they mean by "so far"? All we have is the assertion "the women's wage premium pales in comparison to the one enjoyed by even the low-rent prostitutes from a hundred years ago." On the previous page, we learn that modern "street prostitutes" earn $350 per week. On p.24, we learn that in the past, Chicago prostitutes took in $25 a week, "the modern equivalent of more than $25,000 a year". Unfortunately, neither of these two numbers is comparable to $350. Dividing $25,000 by 50 weeks (approx.) gives $500 per week. So the drop is $150 off $500, or 30%. But... this is a comparison of wages from prostitution, not of "wage premium". On p.29, the modern study found "prostitution paid about four times more than [non-prostitution] jobs." On p.23, they say "a tempted girl who receives only $6 per week working with her hands sells her body for $25 per week" so we can compute the historical ratio as $25/$6 = 4.17 times. So, I must have gotten the wrong data.
pp.30-31 -- Some interesting comparisons, stating that only 5 percent of men today lose their virginity to a prostitute, versus 20 percent of those born in the 30s. Just be reminded of their earlier warning about truthfulness in research studies involving sensitive topics.
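For anyone who wants to retrace the arithmetic in the p.30 note, here it is as a short Python sketch. The dollar figures are the ones quoted from the book; the 50-week year is my approximation, and the variable names are mine.

```python
# Figures quoted from the book.
modern_weekly_wage = 350              # modern street prostitute, per week
historical_annual_in_todays_dollars = 25_000
historical_prostitute_weekly = 25     # old dollars
historical_other_job_weekly = 6       # old dollars
modern_premium = 4                    # "about four times more"

# Assumption: roughly 50 working weeks per year.
weeks_per_year = 50

# Put the historical wage on a weekly, current-dollar basis.
historical_weekly_in_todays_dollars = historical_annual_in_todays_dollars / weeks_per_year

# Drop in the wage level, as a fraction of the historical wage.
wage_drop = (historical_weekly_in_todays_dollars - modern_weekly_wage) / historical_weekly_in_todays_dollars

# Historical wage premium: prostitution pay relative to other work.
historical_premium = historical_prostitute_weekly / historical_other_job_weekly

print(f"Historical weekly wage in current dollars: ${historical_weekly_in_todays_dollars:.0f}")
print(f"Drop in wage level: {wage_drop:.0%}")
print(f"Historical premium: {historical_premium:.2f}x vs modern premium: about {modern_premium}x")
```

The wage level falls by 30 percent, but the premium over other work barely moves (4.17 times then, about 4 times now), which is why the "fallen so far" claim is hard to square with these particular numbers.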
p.32 -- They assert "prostitution is more geographically concentrated than other criminal activity: nearly half of all Chicago prostitution arrests occur in less than one-third of 1 percent of the city's blocks." I have several problems with this sentence. What is the concentration of other criminal activities? Arrests are not the same as prevalence. And, a few pages later (p. 41), they will make the startling claim that "a Chicago street prostitute is more likely to have sex with a cop than to be arrested by one."
p.33 -- A table of sex acts and their average prices. It's important to establish the sample sizes underlying the average prices. The researcher documented 2,200 sex acts, and the least frequent act accounted for 9% of those, so about 200 acts. To establish the margin of error around those averages, I'd also need the spread of the individual prices.
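To show why the spread matters, here is a sketch of the margin-of-error calculation for the least frequent act. The sample size comes from the figures above; the standard deviation of prices is not reported in the book, so the value below is purely hypothetical.

```python
import math

total_acts = 2200            # sex acts documented by the researcher
least_frequent_share = 0.09  # share of the least frequent act

# Sample size behind the average price of the least frequent act.
n = round(total_acts * least_frequent_share)

# Assumption: the standard deviation of individual prices is $20.
# This number is made up; the book does not report the spread.
price_sd = 20.0

# Standard error of the mean, and an approximate 95% margin of error.
standard_error = price_sd / math.sqrt(n)
margin_of_error = 1.96 * standard_error

print(f"Sample size: {n}")
print(f"Approximate 95% margin of error: +/- ${margin_of_error:.2f}")
```

Doubling the assumed spread doubles the margin of error, so without the spread of individual prices we cannot tell whether two average prices in the table are really different.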
p.40 -- They compare a real estate agent to a pimp. Some data is used to justify the claim that the Internet has reduced the power of real estate agents while the Internet "isn't very good -- not yet, at least -- at matching sellers to buyers". Therefore, the impact of a pimp is larger than that of a real estate agent. Would like to see a study of the Internet substituting for pimps. As it stands, this is an assertion without proof.
p.46 -- Some of the language is overdone. They say the men "blew away" the women in a version of an SAT-style math test with twenty questions. What does "blowing away" mean? Scoring 2 more correct questions out of 20.
pp.47-8 -- Tackle a study on the wage change of men or women who underwent sex change operations. As they point out, this study really doesn't answer the question of what might happen if men are randomly made into women, or vice versa. The problem is this is not a random selection. The study found men who became women lost a third of their previous wages. This would imply they did not keep their prior jobs. But does this job change show women gravitate to poorer-paying jobs, or that higher-paying jobs select men? The direction of causation crops up again, and we are no closer to the answer.
The rest of the chapter -- They discuss Allie, a high-end prostitute. This section has little interest for a statistician since it is a sample of one.
Please do let me know if this sort of review is useful or not.
Since I announced the book to Junk Charts readers last week, a number of people have noticed and helped spread the word. I just want to let you know I'm very grateful! Here, in alphabetical order (with apologies to those I missed):
John Sall, co-founder of SAS and champion for their superb JMP software, named Numbers Rule Your World as the first in the next crop of books dealing with the rise of analytics in management. He also kindly contributed a review:
a book that engages us with the stories that a journalist would write, the compelling stories behind the stories as illuminated by the numbers, and the dynamics that the numbers reveal.
Ian Ayres, who wrote one of the few books on business analytics (Super Crunchers), liked one aspect of my book:
For those who have anxiety about how organizational data mining is impacting their world, Fung pulls back the curtain to reveal the good and the bad of predictive analytics.
I am a big believer in data-driven decision-making but also in the "use-with-care" principle.
The Press of Atlantic City interviewed me about SAT test design for this article.
Paul Krugman has written a nice piece explaining the "death spiral" problem in the insurance business. He said:
Bear in mind that private health insurance only works if insurers can
sell policies to both sick and healthy customers. If too many healthy
people decide that they’d rather take their chances and remain
uninsured, the risk pool deteriorates, forcing insurers to raise
premiums. This, in turn, leads more healthy people to drop coverage,
worsening the risk pool even further, and so on.
The background is the teetering state of some private health insurers in California.
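The spiral Krugman describes can be sketched as a toy simulation. Everything here is hypothetical: ten customers with made-up expected annual costs, an insurer that charges the average cost of whoever is still in the pool, and customers who stay only if the premium is at most 1.2 times their own expected cost.

```python
def simulate_death_spiral(expected_costs, tolerance=1.2, max_rounds=20):
    """Toy model: returns a list of (pool size, premium) per round.

    Assumptions (all hypothetical): the premium equals the average
    expected cost of the current pool, and a customer stays only if
    the premium is no more than `tolerance` times their own cost.
    """
    pool = sorted(expected_costs)
    history = []
    for _ in range(max_rounds):
        premium = sum(pool) / len(pool)
        history.append((len(pool), premium))
        # Healthier (low-cost) customers find the premium too high and leave.
        stayers = [cost for cost in pool if premium <= tolerance * cost]
        if len(stayers) == len(pool) or not stayers:
            break  # pool has stabilized, or nobody is left
        pool = stayers
    return history


# Ten customers with expected annual costs from $1,000 to $10,000.
costs = [1000 * k for k in range(1, 11)]
history = simulate_death_spiral(costs)

for size, premium in history:
    print(f"pool size {size:2d}, premium ${premium:,.0f}")
```

In this toy run the pool shrinks from 10 customers to 3 while the premium climbs from $5,500 to $9,000. Real insurers price far more carefully, so treat this only as a picture of the mechanism.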
Chapter 3 of Numbers Rule Your World addresses this problem, using Florida hurricane insurers as an example. Statistics is behind the entire concept of insurance. With large numbers of customers, insurance firms can predict their annual losses (i.e. payouts) with a high degree of certainty. This is a result of the law of large numbers. Because individuals, especially when young, will have difficulty predicting when, and how, they will die, they have an incentive to buy insurance.
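A small sketch shows how the law of large numbers works in the insurer's favor. It assumes identical policies with independent claims, each paying the same amount; the 5 percent claim probability and the function name are hypothetical. Under those assumptions the number of claims is binomial, and the variability of total payouts relative to their expected value is sqrt((1 - p) / (n * p)).

```python
import math


def relative_loss_variability(n_policies, claim_probability):
    """Standard deviation of total payouts divided by their mean.

    Assumes independent, identical policies with a fixed payout per
    claim, so the claim count is Binomial(n, p).
    """
    return math.sqrt((1 - claim_probability) / (n_policies * claim_probability))


claim_probability = 0.05  # hypothetical annual chance of a claim

for n_policies in (100, 10_000, 1_000_000):
    cv = relative_loss_variability(n_policies, claim_probability)
    print(f"{n_policies:>9,} policies: payouts vary by about {cv:.1%} of the expected total")
```

Each hundredfold increase in customers cuts the relative uncertainty by a factor of ten. The sketch leans entirely on the independence assumption; when losses are correlated across customers, the uncertainty does not shrink this way.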
But hurricane insurance, or more broadly, disaster insurance, firms face even thornier issues than health insurers. Health insurers can reliably predict and control what proportion of their customer base will get sick each year. What about hurricane insurers? For them, "healthy customers" are analogous to residents with homes in areas that are not hurricane-prone; do we think they are carrying hurricane insurance?
PS. For the economists out there, the question being tossed around is: are hurricane risks insurable? can a private market for hurricane insurance survive?
Dentist: I only manage to clean the exposed part of the teeth. In your X-ray, we can see tartar buildup underneath the gums. Your teeth will fall out eventually if we don't clean it up now.
Statistician: My teeth feel fine, in fact, the best in years. I don't like the cost-benefit tradeoff of deep cleaning. All my friends who did it did not think it helped.
Dentist: It didn't work for them because they didn't floss. Let me tell you about my friend ...
Dentists seem to have special friends who like to tell them horror stories about how their teeth fell out at a young age. I am not interested in the worst-case scenario, nor a sample size of one, not randomly selected.
So I went home and googled "deep cleaning" and "clinical trials", and "deep cleaning" and "number needed to treat". No informative results except for more scare stories with no data. If you know of data or have experience related to this, please comment.
I would like to understand:
How many people have to do "deep cleaning" for one patient to benefit from the procedure? (This is the "number needed to treat". I would not be surprised if this number is 100. A lot of medicines have high NNT, meaning we are buying lottery tickets.)
How is the "benefit" of this procedure defined in the literature? My teeth will never fall out? They will fall out two days later than if I didn't do it? The "pocket" between my teeth and gums will reduce by 1%? 5%? 10%?
If the "pockets" are reduced in size by 1%, how much longer will I live?
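For reference, the number needed to treat is the reciprocal of the absolute risk reduction. Here is a minimal Python sketch; the risks plugged in are made up for illustration, since I could not find real figures for deep cleaning.

```python
def number_needed_to_treat(risk_untreated, risk_treated):
    """NNT = 1 / (risk without treatment - risk with treatment)."""
    absolute_risk_reduction = risk_untreated - risk_treated
    if absolute_risk_reduction <= 0:
        raise ValueError("treatment shows no benefit, so NNT is undefined")
    return 1 / absolute_risk_reduction


# Hypothetical risks of the bad outcome (e.g. losing a tooth) over some period.
risk_without_cleaning = 0.05
risk_with_cleaning = 0.04

nnt = number_needed_to_treat(risk_without_cleaning, risk_with_cleaning)
print(f"Number needed to treat: about {nnt:.0f}")
```

With these invented risks, about 100 patients would have to undergo the procedure for one of them to benefit. The answer depends entirely on how the "benefit" is defined and measured, which is the second question on my list.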
Laura Landro pretty much parroted the official FDA line on food recalls in her Wall Street Journal article ("Why Some Foods Are Riskier Today", Feb 15). The only thing I could find in the piece that is relevant to the headline is that U.S. consumer demand for "year-round fresh produce and fish" leads to rising imports from "countries that don't have the same level of sanitary practices as required in the U.S." Only if the food is produced by Americans in America.
Except... she started off the article by citing several recent food recalls:
sausage and salami from Rhode Island
chewy chocolate chip granola bars from California
cheese in Washington state
and closed with chocolate chip cookie dough from Virginia.
Her larger point - that food recalls have become more and more frequent - is addressed in Chapter 2 of Numbers Rule Your World. I predict that this trend will snowball, even if we ban all imported foods. As citizens, we need to understand the statistics behind food recalls because they bring some surprises:
Many food recalls are not likely to have saved any lives
All food recalls cause economic losses, sometimes substantial
Rarely is the FDA certain that the recalled foods really caused the alleged illnesses
If anyone could identify the cause of foodborne disease outbreaks, I would trust the FDA/CDC staff with it (but see above)
I'll be interviewed on Dan Angelo's Education Today show on KUCR (Riverside, CA), to be broadcast on Feb 16 at 9:30 pm EST and Feb 17 at 11:30 am EST.
We will be talking about the statistics behind the design of SAT tests, which is the subject of Chapter 3 of Numbers Rule Your World. The SAT is the most scientifically sound standardized test we have, and dozens of ETS statisticians work tirelessly to create it. I discuss the concepts behind some of the things they do.
A link to the audio will be placed here when available.