In this week's Statbusters (link), we discuss two recent widely-shared articles, one on deaths while taking selfies, and the other on the gender gap in income among graduates of top-tier universities.
The common element between these two pieces is a reductionist analysis that looks at the correlation between a single variable X and an outcome Y when the outcome Y is affected by a multitude of variables. For example, it is reported that female graduates of Princeton or Harvard or Stanford earn $50,000 less annually than male graduates ten years after graduation. The two groups being compared do not differ just by gender, despite what that statement implies.
Note, however, that there is nothing wrong with the computation. Similarly, the deaths linked to taking selfies indeed outnumber deaths by shark attacks. The problem is that the analysis is misleading. It causes readers to come to the wrong conclusions.
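To make the confounding concrete, here is a minimal sketch with entirely made-up numbers: the raw earnings gap between male and female graduates can far exceed the gap within any field of study, because the two groups also differ in their mix of fields.

```python
# Hypothetical numbers, for illustration only: graduates by field,
# gender, headcount, and average income ten years after graduation.
groups = [
    # (field, gender, count, avg_income)
    ("finance",  "male",   600, 250_000),
    ("finance",  "female", 200, 240_000),
    ("teaching", "male",   400, 100_000),
    ("teaching", "female", 800,  95_000),
]

def avg_income(gender):
    rows = [(n, inc) for _, g, n, inc in groups if g == gender]
    return sum(n * inc for n, inc in rows) / sum(n for n, _ in rows)

raw_gap = avg_income("male") - avg_income("female")
print(f"raw gap: ${raw_gap:,.0f}")  # $66,000 overall, even though the
                                    # within-field gaps are only $10,000
                                    # and $5,000
```

The single-variable comparison attributes the whole $66,000 to gender, when most of it comes from the difference in field choice between the two groups.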
You may have seen the clickbait of the week: "Selfies have killed more people than sharks this year!"
There are three problems with this statement:
1. Almost everyone takes selfies but many, many fewer of us are exposed to shark attacks. Thus even if the absolute number of deaths from selfies is higher than that of deaths by sharks, the risk of dying from taking selfies is still much lower.
2. The headline did not lie. It is worse than lying. It misleads readers into thinking that selfies are more dangerous.
3. Your risk of dying from selfies is ridiculously low (only 12 deaths in 9 months, and countless exposures, in both senses of the word).
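The base-rate point in #1 can be put in numbers. The death tallies come from the news coverage; the exposure counts below are pure assumptions (nobody tallies selfies), but they show how a larger absolute death count can coexist with a far smaller per-exposure risk:

```python
# 12 selfie deaths is the reported figure; the shark death count and
# both exposure counts are assumed, purely to illustrate the arithmetic.
selfie_deaths, shark_deaths = 12, 8
selfies_taken = 1_000_000_000     # assumed selfies taken in the period
ocean_swims   = 50_000_000        # assumed ocean swims in the period

selfie_risk = selfie_deaths / selfies_taken
shark_risk  = shark_deaths / ocean_swims

# More selfie deaths in absolute terms, yet a far lower risk per exposure.
print(selfie_deaths > shark_deaths)  # True
print(selfie_risk < shark_risk)      # True
```

Under these assumptions a selfie is roughly a tenth as risky as a swim, even though selfies "kill more people".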
Now, if someone died from taking a selfie in front of a shark, that would be something.
PS. [9/24/2015] A dialogue on Twitter reveals problem #4: the deaths while taking selfies were attributed to taking selfies but could have been attributed to other more direct causes such as climbing on top of a moving train. Meanwhile, sharks do directly attack and kill people.
To borrow from probability theory, one can apply the "memoryless" property to our news media: it appears that current news reports do not need to retain memory of past news reports.
The other day, I commented on news reports about doping that suggest a widespread belief that steroids are not abused by small-time athletes because they have no chance of winning--when for the longest time, the belief was that steroids were not abused by big-time athletes because they had too much talent or worked too hard to need extra help.
So now there is an article on ZDNet about Groupon that states with a straight face:
There was long speculation that the daily deals craze would eventually die out, with many analysts deeming the model unsustainable because of its contentious relationship with merchants. Seemingly realizing its shaky future, Groupon attempted to refocus its business more around e-commerce, but the company's best efforts appear to have fallen short.
As I documented in Numbersense, the mass media was in euphoric consensus about Groupon's world-conquering prospects... to the tune that the company was able to raise about $1 billion in the public market (most of which immediately went to pay off initial investors).
As a reminder, this was a post I wrote before Groupon's IPO pointing out the flaws in their business model.
The majority of the business and technology press was not speculating that the "daily deals craze would eventually die out". In fact, we were told Groupon would become the next eBay or Amazon.
If you read the New York Times's stories denigrating BMI as an obesity metric, you have heard only a small part of the story. Their latest sensationalist coverage says that 18% of BMI-obese people are not really obese, with 12% labeled "healthy obese" and 6% called "skinny fat".
This is a major missed opportunity by an important newspaper in promoting deeper thinking about scientific data.
The first question to ask when faced with those numbers is: how is "true" obesity defined? A claim that 18% of BMI-obese people are misclassified is a claim that someone knows the true state of "obesity". But there is no objective measure. BMI is based on height and weight; other measures are based on waist size, body fat, and so on.
Those numbers make an assumption that body-fat percentage as measured by the DXA method is God's word on obesity. Someone is making a claim that the 12 percent of people who are obese under BMI but not obese under DXA are "healthy". The claim is not even that these people are "not obese"--they are "healthy". This isn't science.
Further, DXA requires a body scan on an expensive machine while BMI is a measure that can be computed and monitored by anyone at home. That alone makes using DXA as a measure impractical.
Further, there is a huge literature that establishes the correlation between BMI and a variety of health problems. There is very little research that shows DXA as correlated with health problems. (Much of this is because DXA is not readily measured.)
Further, the chart is a bit misleading because officially, BMI-obese is BMI over 30. Between 25 and 30 is called "overweight". Almost all of the "misclassifications" are BMI-overweight so those "healthy obese" are just under the BMI-obese definition. In any case, the worst that could happen to these 12% is that they are asked to exercise, eat healthy, etc. What is the cost of such misclassification? The "skinny fat" group is a bit more concerning but where is the proper scientific evidence proving that this group faces abnormal death risk from fat?
Chapter 2 of Numbersense (link) has the full story on this misguided argument about DXA vs BMI. The bottom line is: our obesity crisis will not be solved by changing how we measure obesity.
In the newest column for the Daily Beast, Andrew and I look at the media's fascination with expressing large numbers as daily numbers. (link) In short, you should divide by 365 only when the metric actually scales with time, and be careful if the metric is not evenly distributed across time. We discuss the following headlines: "Air pollution in China is killing 4,000 per day" and "Periscope users view 40 years of video per day".
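A quick sketch of the divide-by-365 point, with made-up numbers: the daily figure is meaningful only when the total accrues roughly evenly over the year.

```python
# Made-up numbers, for illustration. Dividing an annual total by 365
# yields a sensible daily figure only for metrics that accrue steadily.
annual_deaths = 1_600_000
deaths_per_day = annual_deaths / 365   # ~4,384/day: reasonable, since
                                       # deaths occur year-round

# Counter-example: a metric concentrated in part of the year.
monthly_sales = [10] * 10 + [100, 150]  # $M per month, peaking Nov/Dec
annual_sales = sum(monthly_sales)       # 350
naive_daily = annual_sales / 365        # ~0.96 $M "per day"
typical_offseason_day = 10 / 30         # ~0.33 $M on an actual slow day
# The "per day" figure describes almost no actual day in the year.
```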
This piece is part of the StatBusters column written jointly with Andrew Gelman. Hope they fix the labeling soon. In it, we talk about two recent studies on data privacy that lead to contradictory conclusions. How should the media report such surveys? Is the brand name of the organization enough? In addition, we debunk the notion that consumers will definitely get something valuable out of sharing their data.
Are science journalists required to take one good statistics course? That is the question in my head when I read this Science Times article, titled "One Cup of Coffee Could Offset Three Drinks a Day" (link).
We are used to seeing rather tenuous conclusions such as "Four Cups of Coffee Reduces Your Risk of X". This headline takes it up another notch. A result is claimed about the substitution effect of two beverages. Such a result is highly unlikely to be obtained in the kind of observational studies used in nutrition research. And indeed, a glance at the source materials published by the World Cancer Research Fund (WCRF) confirms that they made no such claim.
The headline effect is pure imagination by the reporter, and a horrible misinterpretation of the report's conclusions. Here is a key table from the report:
The conclusions on alcoholic drinks and on coffee come from different underlying studies. Even if they had come from the same study, you cannot take different regression effects and stack them up. The effect of coffee is estimated for someone who is average on all other variables. The effect of alcohol is estimated for someone who is average on all other variables. The average person in the former case is not identical to the average person in the latter case. So if you add (or multiply, depending on your scale) the effects, the total effect is not well-defined.
In addition, you can only add (or multiply) effects if you first demonstrate that the two factors do not interact. If there is interaction, the effect of alcohol is different for people who drink less coffee relative to those who drink more. The alcohol effect stated in the table above, as I already pointed out, is for an average coffee drinker. Conversely, the protective effect of coffee may well vary with alcohol consumption.
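A toy model makes the interaction point concrete. The coefficients below are invented; the only point is the algebra: once an interaction term exists, the sum of the two separately estimated effects no longer equals the joint effect.

```python
# Invented coefficients on a log-risk scale, for illustration only.
def log_risk(alcohol, coffee):
    return 0.30 * alcohol - 0.10 * coffee - 0.05 * alcohol * coffee

baseline       = log_risk(0, 0)
effect_alcohol = log_risk(3, 0) - baseline  # three drinks, no coffee: +0.90
effect_coffee  = log_risk(0, 1) - baseline  # one cup, no alcohol:     -0.10
joint          = log_risk(3, 1) - baseline  # both together:           +0.65

# Stacking the separate effects gives 0.80, not the actual joint 0.65,
# because the interaction term appears only when both factors are present.
print(effect_alcohol + effect_coffee, joint)
```

The headline's "one cup offsets three drinks" logic amounts to stacking the two separate effects, which is wrong whenever the interaction coefficient is nonzero.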
The reporter also misrepresented the nature of the analysis. We are told: "In the study of 8 million people, cancer risk increased when they consumed three drinks per day. However, the study also found that people who also drank coffee, offset some of the negative effects of alcohol."
The reporter made it sound like a gigantic randomized controlled study was conducted. This is a horrible misjudgment. WCRF did not do any study at all, and certainly no researcher asked anyone to drink specific amounts of alcohol or coffee. The worst is the comment on people who drank coffee as well as alcohol. I can't find a statement in the WCRF report about such people. It's simply made up based on the false logic described above.
At one level, the journalist misquoted a scientific report. At another level, the WCRF report is rather disappointing.
The authors of the executive summary repeatedly use the language of causation. For example, "There is strong evidence that being overweight or obese is a cause of liver cancer." Really? Show me the study that shows obesity "causes" liver cancer.
Take one of their most "convincing" findings: "Aflatoxins: Higher exposure to aflatoxins and consumption of aflatoxin-contaminated foods are convincing causes of liver cancer." The causation is purely an assumption of the panel who reviewed prior studies. In Section 7.1, readers learn that this cause-effect conclusion comes from "four nested case-control studies and cohort studies" for which "meta-analyses were not possible". So not a single randomized trial and no estimation of the pooled effect.
What is nicely done in the report is the inclusion of "mechanisms" which are speculative explanations for the claimed causal effects. It's great to have thought carefully about the biological mechanisms. Nevertheless, these sections are basically "story time" unless researchers succeed in establishing those unproven links.
Dragged by infectious incuriosity, the financial press ran with the story that the fall in gasoline prices (a 50% drop in six months) is "the best economic stimulus one can get". See former Deputy Treasury Secretary Robert Altman on CNBC, Business Insider's "cheap gas boost", the Wall Street Journal citing "low oil prices as an effective tax cut for consumers", the New York Times quoting a Citigroup analyst claiming a global stimulus of more than $1 trillion, etc., etc.
This is the kind of story that one should believe only if half asleep. Here are three reasons why this conjecture is likely to be wrong:
1. Forgetting the big picture
There was a McDonald's next to a Burger King in a small town. The Burger King went out of business, and the McDonald's suddenly did twice its usual business. Surely McDonald's was the winner here, but did the economy of the town expand? Unfortunately not. Consumers merely shifted their spend from Burger King to McDonald's.
Now, consider a household that spends $200 a month on gas before the oil price crash. Let's say the same amount of gas now costs $100. According to those rosy-cheeked economists and journalists, the household now has an extra $100 to spend on other things, and this "extra" spending stimulates the economy.
But the total amount of expenditure is still $200. The only thing that changes here is the mix of spending. GDP is based on total spend, not the mix of spend. Some sectors of the economy will benefit but at the expense of the oil and gas sector.
2. Imperfect substitution
Consider our household again. The total economy size remains the same only if the household spends every dollar of the $100. If the household saves even one of those dollars, the economy shrinks, compared to before.
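Points 1 and 2 can be expressed in a few lines of arithmetic, using the hypothetical $200-a-month household above:

```python
# The household sketch: $200/month on gas before the crash, $100 after.
gas_before, gas_after = 200, 100
windfall = gas_before - gas_after       # the $100 "stimulus"

# Point 1: if every windfall dollar is re-spent elsewhere, total spending
# (the input to GDP) is unchanged; only the mix shifts between sectors.
total_after = gas_after + windfall
print(total_after == gas_before)        # True: no net stimulus

# Point 2: if the household saves part of the windfall, measured
# spending falls below its previous level.
saved = 20
total_after = gas_after + (windfall - saved)
print(total_after < gas_before)         # True: 180 < 200
```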
3. Making bad assumptions about the future
It's unclear from any of those articles how the analysts came up with the size of this oil-drop stimulus. Every one of them must make a forecast about future oil prices. I bet many of them take the current price as the new normal, and use that price as the future price.
If I told you not to take an extreme value and treat it as the average, you'd scold me for stating the obvious.
As with most economic arguments, one could posit a much more complex chain of relationships to explain how one gets from a 50% drop in oil prices to trillions in economic stimulus. It is the business journalist's job to explain that complicated chain. The connection is clearly not as simple as reported. If one establishes a chain, such as A up -> B down -> C down -> D up, each of those causal links should be supported with evidence.
The same type of fallacious thinking pervades the business sector. For example, we keep hearing about the growth in retail sales from mobile devices. We don't know if consumers are shifting from the Web channel to the mobile channel, and how much of the mobile sales are incremental.
Journalism suffers from an archiving challenge in the digital age, which I wrote about here.
Even worse is the fate of data graphics. This has always been an issue, as digital archives of newspapers do not save any of the graphics. (Try going to the New York Times archive to see for yourself).
The new wave of graphing technology is making this problem worse!
The new technology embeds charting instructions within the HTML code itself, which means that the chart is assembled "on the fly". Think of each chart existing as a collection of pieces: legend title, legend text, axes, etc. This presents many challenges:
A chart that exists as one integral image (jpg, png, etc.) is relatively easy to save. How does one save a chart that exists in a dozen pieces?
One can never be sure what the reader is seeing. Did all the pieces render properly? Does the chart look different depending on which browser, OS, or device is rendering it?
In some applications, the browser might make a call to a remote database to fetch the data for rendering the chart. This means it's possible for someone looking at a chart this morning to see something different from someone else looking at the "same" chart in the afternoon. Since those are two different readers, no one will even notice the difference. Does saving the graphic now involve saving snapshots of the database too?
Let me illustrate the above with a recent example from the New York Times--the graphic about jealousy in dogs I discussed last week on Junk Charts (link).
Here is a screenshot of the chart as it appeared to me at the time I saved it:
The reason I did a screenshot is that when I right clicked to save the chart, it gave me the following:
The saved image is missing the text labels, legend titles, etc., all of these being rendered as separate instructions in the HTML code itself.
Here is the output of the code inspector. The saved image corresponds to one line of the code:
The legend text of the red box is itself a separate line of code:
If you keep going, you will learn that the second legend text is a separate line of code, so is the third legend text. The axis labels on the right are rendered in four separate pieces.
With so much work going into these data graphics, I really hope our industry will rally and figure out a way to archive the work.
PS. Would have loved to have been a fly on the wall at this meeting: http://t.co/FGUNYmLx07
Scott Klein and others attempted an ambitious definition of what needs to be saved. Their work is mostly to do with complex apps, which of course are even hairier than saving the simple static chart I discussed above.