

Seems to me like it would be nice if the tests presented probabilities using Bayesian reasoning. Could even explain the reasoning! I don't think it would have to be too complicated for that.

Let people weigh the cost of false positives/negatives for themselves.
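To make the Bayesian suggestion concrete, here is a minimal sketch of how a test result could be turned into a probability with Bayes' rule. The function name and every number in it (prior, sensitivity, specificity) are made up for illustration; nothing here comes from an actual DNA test kit.

```python
def posterior_risk(prior, sensitivity, specificity, positive=True):
    """Return P(condition | test result) using Bayes' rule.

    prior       -- assumed base rate of the condition in the population
    sensitivity -- assumed P(test positive | condition present)
    specificity -- assumed P(test negative | condition absent)
    """
    if positive:
        like_present = sensitivity
        like_absent = 1.0 - specificity
    else:
        like_present = 1.0 - sensitivity
        like_absent = specificity
    numerator = like_present * prior
    return numerator / (numerator + like_absent * (1.0 - prior))


# Hypothetical numbers: 1% base rate, 90% sensitivity, 95% specificity.
print(round(posterior_risk(0.01, 0.90, 0.95), 3))  # 0.154
```

With these invented inputs, a positive result moves the risk from 1% to about 15%. Reporting that number, along with the reasoning behind it, leaves the judgment about what to do with it to the reader.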


Direct-to-consumer SNP tests such as 23andMe report increased or decreased risk estimates as just that: posterior probabilities of risk given the customer's DNA. It's not correct to say 'any test, including these DNA test kits, can be tuned to allow more of the false positive errors, or more of the false negative errors.' That would only be true if the tests were saying "you have cancer" or "you won't get Alzheimer's", that is, if the risk estimates were being collapsed to binary predictions. But they're not saying that.

This isn't necessarily an argument that these tests should be sold to the public, or that the Times reported the story well. But your criticism is misleading in this particular respect.


HarlanH: Thanks for the comment; it made me think a bit more about what I wrote, and also read the Wiki entry on genetic testing. While it is not as straightforward as my sentence implies, I still think these DNA tests have errors. For example, the Wiki says "it is possible that the test missed a disease-causing genetic alteration because many tests cannot detect all genetic changes that can cause a particular disorder". Even detecting the presence or absence of a given chemical is unlikely to be error-free, since trace amounts can always be found, and these are read as absence when they fall under some threshold.

HarlanH/DavidC: I'm going to poke the Bayesian readers now. Specifying posterior probabilities skirts the problem but doesn't solve it. In essence, you pass the ball to the end-user, and it is now the decision-maker who makes the false positive or false negative errors. Someone who gets the DNA test result will want to decide whether to seek medical help. That is a binary decision, and behind that decision is an implicit cutoff probability.


But behind that implicit cutoff probability are the costs of getting tested vs the costs of late diagnosis. These can depend both on the patient's preferences/resources and on the state of medicine for that particular disease; if you give the patient a +/-, you're implicitly including some assumptions about this information.
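A rough sketch of how costs translate into a cutoff, under a deliberately simple assumption: following up costs a fixed amount whether or not the condition is present, and skipping follow-up costs something only if the condition is present and gets diagnosed late. The function names and cost figures are hypothetical.

```python
def cutoff_probability(cost_followup, cost_late_diagnosis):
    """Risk level above which following up has the lower expected cost.

    Expected cost of following up:  cost_followup
    Expected cost of not following: p * cost_late_diagnosis
    Following up is cheaper when p > cost_followup / cost_late_diagnosis.
    """
    return min(1.0, cost_followup / cost_late_diagnosis)


def should_seek_help(p, cost_followup, cost_late_diagnosis):
    """Collapse a reported risk p into a yes/no decision."""
    return p > cutoff_probability(cost_followup, cost_late_diagnosis)


# Same reported risk, two hypothetical patients with different cost ratios.
reported_risk = 0.15
print(should_seek_help(reported_risk, 1.0, 20.0))  # True  (cutoff 0.05)
print(should_seek_help(reported_risk, 1.0, 4.0))   # False (cutoff 0.25)
```

The same 15% risk leads to opposite decisions for the two patients, which is the point: a test that reports only +/- has already picked one cost ratio on everyone's behalf.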


You must not be familiar with John Tierney's work. He's a big-time libertarian and former Op-Ed columnist. I don't think much of Tierney's views most of the time, but he is technically writing a column in the science section, so by journo standards he's allowed to express his opinion.
Tierney wiki page: http://goo.gl/5rM7k


@Kaiser: yeah, the user still has to decide what to do, but I generally have a knee-jerk view that informing people is good, and misleading them is bad. That's really the only reason for my comment.

Ebenezer Scrooge

Let me rephrase Zubin's comment less delicately:
In writing this column, John Tierney did not act as a ho. Instead, he acted as a freak.


