



Seems to me like it would be nice if the tests presented probabilities using Bayesian reasoning. Could even explain the reasoning! I don't think it would have to be too complicated for that.

Let people weigh the cost of false positives/negatives for themselves.
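For readers who want to see what such a Bayesian presentation might look like, here is a minimal sketch of Bayes' rule applied to a positive test result. The function name and every number in it (a 1% baseline risk, 90% sensitivity, 95% specificity) are illustrative assumptions, not figures from any actual DNA test kit.

```python
def posterior_given_positive(prior, sensitivity, specificity):
    """Probability of having the condition given a positive result (Bayes' rule).

    prior:        baseline probability of the condition before testing
    sensitivity:  P(test positive | condition present)
    specificity:  P(test negative | condition absent)
    """
    true_positive = sensitivity * prior
    false_positive = (1.0 - specificity) * (1.0 - prior)
    return true_positive / (true_positive + false_positive)


# Hypothetical inputs, chosen only to illustrate the arithmetic.
risk = posterior_given_positive(prior=0.01, sensitivity=0.90, specificity=0.95)
print(f"Risk after a positive result: {risk:.1%}")
```

With these made-up inputs the risk after a positive result is roughly 15%, which shows why a probability tells the reader more than a bare "positive".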


Direct-to-consumer SNP tests such as 23andMe provide increased/decreased risk estimates as just that: posterior probabilities of risk given the customer's DNA. It's not correct to say 'any test, including these DNA test kits, can be tuned to allow more of the false positive errors, or more of the false negative errors.' That's only true if the tests were saying "you have cancer" or "you won't get Alzheimer's," that is, if the risk estimates were being collapsed to binary predictions. But they're not saying that.

This isn't necessarily an argument that these tests should be sold to the public, or that the Times reported the story well. But your criticism is misleading in this particular respect.


HarlanH: Thanks for the comment; it made me think a bit more about what I wrote, and also read the Wikipedia entry on genetic testing. While it is not as straightforward as my sentence implies, I still think these DNA tests have errors. For example, the Wiki says "it is possible that the test missed a disease-causing genetic alteration because many tests cannot detect all genetic changes that can cause a particular disorder". Even detecting the presence or absence of a given chemical is unlikely to be error-free: trace amounts can always be found, and they are interpreted as absence if they fall under some threshold.

HarlanH/DavidC: I'm going to poke the Bayesian readers now. Specifying posterior probabilities is a way to skirt the problem but it doesn't solve the problem. In essence, you pass the ball to the end-user and it is now the decision-maker who makes the false positive or false negative errors. Someone who gets the DNA test result will want to make a decision on whether to seek medical help - that is a binary decision, and behind that decision is an implicit cutoff probability.


But behind that implicit cutoff probability are the costs of getting tested vs the costs of late diagnosis. These can depend both on the patient's preferences/resources and on the state of medicine for that particular disease; if you give the patient a +/-, you're implicitly including some assumptions about this information.
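To make the link between costs and the implicit cutoff concrete, here is a rough sketch under a simple expected-cost assumption: seeking medical help when healthy costs `cost_false_positive`, skipping it when ill costs `cost_false_negative`, and correct decisions cost nothing. The function names and cost values are hypothetical and chosen only for illustration.

```python
def cutoff_probability(cost_false_positive, cost_false_negative):
    """Risk level above which acting has the lower expected cost.

    Acting costs (1 - p) * cost_false_positive in expectation;
    not acting costs p * cost_false_negative. Acting wins when
    p > cost_false_positive / (cost_false_positive + cost_false_negative).
    """
    return cost_false_positive / (cost_false_positive + cost_false_negative)


def should_seek_help(risk, cost_false_positive, cost_false_negative):
    """Binary decision implied by a risk estimate and a person's own costs."""
    return risk > cutoff_probability(cost_false_positive, cost_false_negative)


# Hypothetical: a late diagnosis is judged 9 times as costly as a needless visit.
print(cutoff_probability(1.0, 9.0))
print(should_seek_help(0.15, 1.0, 9.0))  # same risk, costly late diagnosis
print(should_seek_help(0.15, 1.0, 1.0))  # same risk, equal costs
```

The same 15% risk leads to different decisions under different cost assumptions, which is the sense in which a +/- result bakes in someone else's costs.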


You must not be familiar with John Tierney's work. He's a big-time libertarian and former Op-Ed columnist. I don't think much of Tierney's views most of the time, but he is technically writing a column in the science section, so by journo standards he's allowed to express his opinion.
Tierney wiki page: http://goo.gl/5rM7k


@Kaiser: yeah, the user still has to decide what to do, but I generally have a knee-jerk view that informing people is good, and misleading them is bad. That's really the only reason for my comment.

Ebenezer Scrooge

Let me rephrase Zubin's comment less delicately:
In writing this column, John Tierney did not act as a ho. Instead, he acted as a freak.


Kaiser Fung. Business analytics and data visualization expert. Author and Speaker.