





I have to disagree a bit here. I really like your explanations but I do not think it is really one or the other - this explanation OR the computations with A, B etc.

An intro class is for a wide population of students. Some of these students will be using stats in their other courses quite a bit (and likely often in their working lives). Others are simply fulfilling a requirement and only really "need" the statistical literacy part.

Teaching to both audiences requires a good prof to go over the thinking behind the numbers AND teach the computations. You may say a student can find that in the text, which is true, but many have trouble understanding the textbook's explanation of these computations.

Mike Anderson

I hadn't thought about emphasizing NPV, but that's a great idea for demonstrating tradeoffs. I also have my stats students calculate PPV as a function of disease prevalence so they can see how it varies among risk groups.
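The PPV exercise mentioned above can be sketched in a few lines of Python. This is only an illustration: it assumes the 90% sensitivity and 93% specificity implied by the NPV computation in this thread, and the function name `ppv` and every prevalence value other than 0.8% are hypothetical, chosen to stand in for different risk groups.

```python
def ppv(prevalence, sensitivity=0.90, specificity=0.93):
    """Positive predictive value: P(disease | positive test), by Bayes' rule.

    The default sensitivity and specificity are assumptions taken from
    the mammogram figures used in this thread.
    """
    true_positive = prevalence * sensitivity
    false_positive = (1 - prevalence) * (1 - specificity)
    return true_positive / (true_positive + false_positive)


# Hypothetical prevalence values standing in for different risk groups.
for prevalence in (0.001, 0.008, 0.05, 0.20):
    print(f"prevalence {prevalence:.1%}: PPV = {ppv(prevalence):.1%}")
```

With the test's accuracy held fixed, PPV rises as prevalence rises, which is the variation across risk groups that the exercise is meant to show.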


I completely agree that your description ends up teaching people more about conditional probabilities... but...

From the "curious" reader's point of view (as opposed to the "interested" reader's), Strogatz's opinion piece is a much easier read. Why?

Because he doesn't get into technicalities or new acronyms. He just shows the curious something they were probably doing badly... And that makes for a much better read for most of his readers (and probably the editor as well, IMHO).

Now, whether what the average NYT reader prefers to read is what is best for him to read, that's a whole different question.


we need to know the "negative predictive value", that is, the chance that one does not have breast cancer given that one has a negative mammogram.

The NPV is 0.11%. That is to say, for those testing negative, they can be almost sure that they don't have cancer.

Are you sure that this is not a mistake? Maybe the NPV is 100% - 0.11%?


Tic: Great catch! I do mean NPV = 99.9%. The computation is:

P(test negative) = 0.8% x 10% + 99.2% x 93% = 92.336%
P(no cancer | test negative) = P(no cancer and test negative) / P(test negative) = (99.2% x 93%) / 92.336% = 99.9%
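For anyone who wants to check the arithmetic, here is a minimal Python sketch of the same computation. The variable names are my own; the three inputs (0.8% prevalence, 10% false negative rate, 93% specificity) are the figures used in the lines above.

```python
# Inputs taken from the computation above.
prevalence = 0.008           # P(cancer)
false_negative_rate = 0.10   # P(test negative | cancer)
specificity = 0.93           # P(test negative | no cancer)

# Total probability of a negative test: missed cancers plus true negatives.
p_negative = prevalence * false_negative_rate + (1 - prevalence) * specificity

# Negative predictive value: P(no cancer | test negative).
npv = (1 - prevalence) * specificity / p_negative

print(f"P(test negative) = {p_negative:.3%}")
print(f"NPV = {npv:.1%}")
```

The complement, 1 - NPV, is the chance of having cancer despite a negative test, which is small but not zero.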


JW: in my view, there should be two tracks for introductory classes, one to prepare students for hands-on statistical analyses, the other to prepare students to practice statistical thinking in their everyday and/or work lives, accepting the fact that they would never do any hands-on work.

If we treat the amount of class time as a scarce resource, then there is a tradeoff between time spent teaching formulas and time spent teaching interpretation and reasoning. Unfortunately, it's a tough choice.

Tom West

The usefulness of a mammogram depends on what happens *after* the test. I assume a positive result means the images show some suspiciously cancer-like tissue. If someone has a positive test, then the best bet would be an additional, more precise test (such as a biopsy) to determine whether or not the person actually has cancer.
The big advantage of a mammogram is that it is non-invasive and has very few negative effects on the patient. So, patients have little to lose.

Any general screening program should have low costs, low risks to the patient, and a low false negative rate. A high false positive rate or low predictive value is an acceptable price to pay, *provided* that any positive result is followed up with a more precise test.
