
Godfree Roberts

Thank you for confirming my suspicion that biostatistics are the slipperiest of all, and the most abused.


Very informative and well-presented analysis. I think most people would agree with your critique of disclosure and documentation, especially since this kind of analysis seems generic enough to be distilled into a "best practice" R script.

But I think there are good(?) reasons why so little is done in this respect. Some are historical: scientific paper(!) shelf space was, and is, scarce. Others stem from the sociological aspects of research production: what gets rewarded is the number of papers, not the amount of disclosure, and there may be a fear of losing the "secret sauce" recipe to competitors.


gg: since there are no limits to what one can put in the "supplementary appendix" these days, there is really no excuse for not providing the details. If "secret sauce" is a concern, then the material in question is not publishable; it belongs to business, not science.

Providing a script is important for reproducibility, and it presents details such as data processing steps more precisely. But it also increases the burden on readers, since R scripts are not easy to read if you don't know R. Scripts also do not contain the model outputs.

This leads to the state of "peer review". How can reviewers judge the validity of models (for statistical adjustments) when they haven't seen the model outputs? If I had to review this paper without additional disclosure, my judgment would merely reflect whether I believe the effectiveness is closer to 30% or 20%.

Kaiser Fung. Business analytics and data visualization expert. Author and Speaker.