I don't mind that they are using observational data, as it would be almost impossible to run a trial. Unfortunately, it all goes downhill from there. It wouldn't be surprising if coffee had an effect on cardiovascular mortality, since caffeine is a stimulant, but they didn't find that. They found an effect on all-cause mortality, yet they didn't look at which particular causes of death were behind it.

The paper shows obvious differences between the coffee groups, which suggests there are probably employment, educational and lifestyle differences that they don't have as covariates. Several of the covariates they do have aren't very accurate: physical activity, alcohol consumption and smoking deserve more than a binary. Year of entry to the study can also be an important confounder, and there are probably others. I dislike categorising things unless necessary, so using the actual number of cups of coffee as a covariate would be my preference, although lots of medical journals seem to be OK with the idea that at 28 cups per week people suddenly start dying.

One thing that will amuse anyone with a good knowledge of survival analysis is "proportional hazards assumption was tested by Martingale-based residuals". I hope not.

Jon Peltier

My own quick and admittedly dirty analysis:
I can draw a single horizontal line that passes through all ten sets of error bars in that chart.
This tells me the effect isn't very strong or particularly significant, despite any patterns I may think I see.
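
The horizontal-line check above can be written down in a few lines: a single horizontal line passes through every error bar exactly when the highest lower bound is no greater than the lowest upper bound. This is only a rough sketch. The function name and the example intervals are made up for illustration and are not taken from the chart.

```python
def common_horizontal_line(intervals):
    """Return the (low, high) band of y-values that lie inside every
    interval, or None if no single horizontal line crosses them all."""
    highest_lower = max(low for low, _ in intervals)
    lowest_upper = min(high for _, high in intervals)
    if highest_lower <= lowest_upper:
        return (highest_lower, lowest_upper)
    return None


# Hypothetical error bars, invented for illustration only (not the study's data).
example_bars = [(0.8, 1.3), (0.9, 1.5), (1.0, 1.6), (0.95, 1.4)]
print(common_horizontal_line(example_bars))
```

If the function returns a band, any horizontal line drawn inside it touches all the bars. Like the eyeball version, this is a quick heuristic and not a formal significance test.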


I currently drink about 11 cups of coffee a week. I should increase this to 18 cups if I want to live!

Kaiser Fung. Business analytics and data visualization expert. Author and Speaker.