


Hi Patrick, you did show which combinations gave negative diagnoses. Remember, every child got a yes/no out of every measure, so wherever you have a dot in your Venn diagram that shows two measures saying "yes", that's a "no" for the third. And likewise wherever there's a "yes" in your Venn diagram for one measure only, that's a "no" from the other two.

That's how I was able to get away with "only" four combinations, where Robert K has a tree of eight: mine is a tree of four times two, arranged as four combinations and their binary complements. By placing the complements back-to-back the way I did, you can see that, e.g., the more likely a measure is to give a positive diagnosis in dissent to the other two, the less likely it is to give a dissenting negative, and vice versa.
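For anyone who wants to check the "four times two" arrangement, here is a minimal sketch. It assumes each of the three measures gives a plain yes/no; the names `MEASURES`, `complement`, `complementary_pairs` and `label` are made up for illustration and come from neither the paper nor any of the charts.

```python
from itertools import product

# The three age-2 measures, each giving a yes/no diagnosis (assumed binary here).
MEASURES = ("Clinician", "Interview", "Observation")


def complement(combo):
    """Flip every yes to no and every no to yes."""
    return tuple(not v for v in combo)


def complementary_pairs():
    """Group the 2**3 = 8 yes/no combinations into pairs of binary complements."""
    pairs = []
    seen = set()
    for combo in product((True, False), repeat=len(MEASURES)):
        if combo in seen:
            continue
        other = complement(combo)
        seen.update({combo, other})
        pairs.append((combo, other))
    return pairs


def label(combo):
    """Human-readable form of one combination."""
    return ", ".join(
        f"{m}={'yes' if v else 'no'}" for m, v in zip(MEASURES, combo)
    )


for a, b in complementary_pairs():
    print(f"{label(a)}  <->  {label(b)}")
```

Each printed line is one back-to-back bar: a combination on one side and its complement on the other, for example the clinician's lone "yes" against the clinician's lone "no".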

I assure you every number in my tornado chart comes only from the Venn diagram, from a data table I constructed using only the image Kaiser reproduced, before I found the paper. The paper just helped me with understanding what the language means. My tornado chart and your Venn diagram are completely mappable to each other.

As an example, look in the upper tornado chart in my first attempt, at the small light blue rectangle on the left of the bottom bar. That says that the Interview and Observation methods both said the child was not autistic in two cases (2), and they were wrong both times (100%). That's the same as saying the clinician alone said the child was autistic in two cases and was right 100%. And there is the same pair of numbers at the bottom left of your Venn diagram, the two dark dots.

Igor Carron


Yes, I was wrong on this, but if ASD includes Autism and PDD-NOS, then how come certain numbers on the right are larger than numbers on the left? [I have updated my blog entry to reflect this correction]

I did not want to address this initially but Antony Unwin clarified the matter. It so happens that there are really five results for each kid:
-Clinician at age 2
-PL-ADOS at age 2
-ADI-R at age 2
-Best estimate at age 2
-Best estimate at age 9

The answer for each test can be one of: Autism, PDD-NOS, Nonspectrum,
which gives 3^5 = 243 possible combinations for each kid.
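A quick sketch to check the count, assuming the five results are independent slots that can each take any of the three answers. The variable names are hypothetical; the labels are the ones listed above.

```python
from itertools import product

# The five results per child, as listed above.
ASSESSMENTS = (
    "Clinician at age 2",
    "PL-ADOS at age 2",
    "ADI-R at age 2",
    "Best estimate at age 2",
    "Best estimate at age 9",
)

# The three possible answers for each result.
OUTCOMES = ("Autism", "PDD-NOS", "Nonspectrum")

# Every possible profile for one child: one outcome per assessment.
profiles = list(product(OUTCOMES, repeat=len(ASSESSMENTS)))

print(len(profiles))  # 243, i.e. 3**5
```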

The figures that I presented display kids
in figure A with a best estimate at age 2 of Autism

in figure B with a best estimate at age 2 of ASD + Autism

The numbers outside the circles reflect the kids for whom the best estimate was not
- Autism
- ASD + Autism, respectively.

Although it is unlikely, a kid could be positive on all 3 tests (clinician, ADI-R, PL-ADOS) and still be counted outside the circles in figure A or B.

I think Robert is right in saying that the issue at hand is really trying to figure out how the best estimate holds up over time. In particular, which test or combination of tests agrees with the best estimate at age 2 and then agrees with the best estimate at age 9.



The reason that, e.g., the part of the Venn diagram for PL-ADOS alone shows 23 children with full autism, but only 7 with autism or some other ASD, is that PL-ADOS was no longer alone in its diagnosis in the second case. Similarly, PL-ADOS and ADI-R agree that 26 children are autistic at age two, but they are joined in their judgment by the clinician when it comes to broadening the diagnosis to ASD, in all but 21 cases.

It makes perfect sense that a non-unanimous diagnosis will be less frequent when you broaden the terms, because the diagnoses become more likely to be unanimous, and since the total of 172 children is fixed, that means the other fractions must decrease by the same amount. The only instance where the non-unanimous numbers actually increased was where both the clinician and ADI-R (but not PL-ADOS) diagnosed only one child autistic, but another child, whom only ADI-R had pronounced autistic, was agreed by ADI-R and clinician (but still not PL-ADOS) to be suffering from some form of autistic spectrum disorder. Perhaps this point should have been made by changing the shape of the second Venn diagram (B) to give it a fatter centre and thinner edges.

Meanwhile, I have made another version of my tornado chart, to highlight the 1-to-1 mapping between it and the original Venn diagram (and Patrick's graphical version). I trust this makes the source of my numbers clearer. Like Patrick, I'm running out of energy to do both versions, but the principle is the same: the tornado chart and the Venn diagram contain exactly the same numbers, presented in two graphically-different ways.

Andrew, my sympathies on feeling that Kaiser gets all the responses. It's probably because this is a specialist information graphics blog, which takes unclear or misleading graphs and tries to remake them. So your challenge is perfectly targeted to the readership of Junk Charts, while your own readers are probably a more diverse group of Bayesian statisticians and other statistics-interested laymen. If it makes you feel better, I'm subscribed to both blogs' RSS feeds :-)

Patrick Murphy

Igor, now I am REALLY confused. I thought we had 4 data points for each child: the three tests at age 2 and the "best estimate" at age 9. But now there is "best estimate" at age 2.

How is the age 2 best estimate performed? Is it derived from the three tests, or something independent?

If the former, then this muddies the chart's primary intent (as I understand it), which is to show which test or test-combination at age 2 is most accurate in predicting the age 9 best estimate. In this case, only the three age 2 test results should be compared with the age 9 best estimate.

If there is also interest in how well the age 2 tests correlate with the age 2 best estimate, this could be a separate chart.

BTW, I am greatly enjoying Derek's reworking of the data to show, for each test, both the accurate predictions (dark blue dots) and the inaccurate ones (light blue). Although the original Venn may contain this data implicitly, seeing it Derek's way makes it much easier to understand.


There are some nice attempts by Murphy and Derek and I particularly like this one http://bernard.lebelle.free.fr/AutismData.png BUT after reading the comments I'm horribly confused about what the data represent.

This reinforces the general notion that Venn diagrams are not great at conveying info. But it also makes it impossible to really judge what sort of improvements we have on the original problem (and what message the different options convey).

Is there, somewhere, a definitive statement of what this data set means? (Maybe it's already up there, but I'm not convinced.)

This is fun, though. We should do this more often.


The paper's abstract has three parts that might address what it's all about:

"Objectives - To examine the stability of autism spectrum diagnoses made at ages 2 through 9 years and identify features that predicted later diagnosis."

"Results - Percentage agreement between best-estimate diagnoses at 2 and 9 years of age was 67, with a weighted κ of 0.72. Diagnostic change was primarily accounted for by movement from pervasive developmental disorder not otherwise specified to autism. Each measure at age 2 years was strongly prognostic for autism at age 9 years, with odds ratios of 6.6 for parent interview, 6.8 for observation, and 12.8 for clinical judgment. Once verbal IQ (P = .001) was taken into account at age 2 years, the ADI-R repetitive domain (P = .02) and the ADOS social (P = .05) and repetitive domains (P = .005) significantly predicted autism at age 9 years."

"Conclusions - Diagnostic stability at age 9 years was very high for autism at age 2 years and less strong for pervasive developmental disorder not otherwise specified. Judgment of experienced clinicians, trained on standard instruments, consistently added to information available from parent interview and standardized observation."

A well-designed graphic would back up these words of the authors in an intuitively-obvious way, although, to be fair, that Venn diagram was not a concluding graphic, but a scene-setting one.

Jon Peltier

I think the Venn approach is fine for a qualitative introductory view, but is not useful to try to present any quantitative information. It is too easy to be carried away by the colors of all the overlapping shapes. Patrick, your best attempt so far was the last, but it is improved further by removing the Venn-type legend.

Derek, I like your attempt with the light and dark blue, but the light blue extremes at the ends of the dark blue bars confuse me. Do they mean that the value should be adjusted left or right, inward or outward? Would it be better to have two adjacent bars, one for age 2 and one for age 9?

Robert, while your approach with the binary tree takes a moment to comprehend, I like it the most. (All approaches take a moment or two to comprehend; I doubt there is a way to present this information to make it instantly accessible.) My own unpublished cut at this data is similar, though I used a horizontal bar to allow longer category labels, and the tree structure was described through the labels, not using a tree. The tree is an improvement.


My later version had the light blue "diagnosis changed from 2 to 9" areas in the middle, which I hope takes care of any question of adjusting the value.

I have a few problems with Robert's graph. First, that it represents the unanimously negative diagnoses as having only a 14% success rate, when in fact they had only a 14% failure rate. As you would expect from unanimity, only 14% of children diagnosed as not autistic at age 2 went on to receive a best-estimate diagnosis of autism by age 9.

Second, that it represents only e.g. the two positive diagnoses made by the clinician where the clinician's diagnosis contradicted the other two measures. I would expect to see a measure of how well the clinician did on his own, regardless of the other two measures, rather than those in lone dissent only.

Finally, the measures for the clinician in dissent are only those in which he makes a positive diagnosis. Since this measure is one in which the clinician is the most cautious, it is not surprising that that caution is rewarded with a high success rate. But sending a child away with a diagnosis of "not autistic" or "not on the autistic spectrum", when the child later turns out to be autistic, is also a failure of diagnosis, and should be measured too.

So while Jon might take a moment to comprehend the tree, a more serious problem is that what he eventually comprehends is wrong. As I say, the ultimate junk chart is one that leads the viewer to conclude that which is not true. If good clear graphics leads to that result, so much the worse.

I have made a graph that I hope addresses these issues. This one is not, like my tornado charts, a straight re-telling of the Venn diagram numbers in a different form. Instead, it is an attempt to do what Robert was doing, which is to tell the conclusions that the Venn diagram numbers would come to, in terms of the stability, between age 2 and 9, of the three diagnosis measures and their combinations.

I included all the diagnoses made by a particular measure, instead of just the exclusive ones that were unshared by either of the two others. As such you will not see quite the same numbers, because they have been added up differently. They are still the same data, though. For instance, the 14% error rate for unanimously negative diagnosis can be seen at the bottom of the first graph, the green dot labelled "6 out of 42".
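The arithmetic behind that green dot, as a small sketch. The only data used are the "6 out of 42" quoted above; the helper name `error_rate` is made up for illustration.

```python
from fractions import Fraction


def error_rate(errors, total):
    """Share of diagnoses that were later contradicted, as an exact fraction."""
    if total <= 0:
        raise ValueError("total must be positive")
    return Fraction(errors, total)


# Unanimously negative at age 2: 42 children, 6 of whom were
# autistic by best estimate at age 9.
rate = error_rate(6, 42)

print(f"error rate:   {float(rate):.0%}")      # 14%
print(f"success rate: {float(1 - rate):.0%}")  # 86%
```

So the 14% is a failure rate, and the matching success rate for unanimous negatives is 86%.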

Note that where two measures agree, there is a lower error rate than when only one measure alone is counted, and that unanimity produces the fewest errors of all. This is entirely what you would expect, rather than that a single measure on its own should perform better than when the same measure agrees with another. Also, the clinician's low false positive rate is balanced by a high false negative rate, making the total error rate no better than the other two measures alone. It is definitely not the case that the clinician rocks and everything else sucks, and it would be a bad misreading of the data for a graph to imply such a thing.
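To make the trade-off concrete, here is a rough sketch of how the three rates relate. The counts below are invented purely for illustration and are NOT the study's numbers. The definitions are also an assumption about what is meant here: the false positive rate is the share of "yes" diagnoses that turned out wrong, and the false negative rate is the share of "no" diagnoses that turned out wrong.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tally:
    """Age-2 diagnoses by one measure, checked against the age-9 best estimate."""
    positives: int        # children the measure called autistic
    wrong_positives: int  # of those, not autistic at age 9
    negatives: int        # children the measure called not autistic
    wrong_negatives: int  # of those, autistic at age 9

    def false_positive_rate(self):
        return self.wrong_positives / self.positives

    def false_negative_rate(self):
        return self.wrong_negatives / self.negatives

    def total_error_rate(self):
        wrong = self.wrong_positives + self.wrong_negatives
        return wrong / (self.positives + self.negatives)


# Hypothetical counts, chosen only to illustrate the point.
cautious = Tally(positives=20, wrong_positives=1, negatives=80, wrong_negatives=24)
liberal = Tally(positives=50, wrong_positives=10, negatives=50, wrong_negatives=15)

for name, t in (("cautious", cautious), ("liberal", liberal)):
    print(
        f"{name}: FP {t.false_positive_rate():.0%}, "
        f"FN {t.false_negative_rate():.0%}, "
        f"total {t.total_error_rate():.0%}"
    )
```

With these made-up counts the cautious measure has a much lower false positive rate, yet both end up with the same total error rate, which is the shape of the argument above.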

Curiously, none of the measures or their combinations has a lower rate of false negative errors than false positives, although there is no reason why they should not. Is this a sign of too much caution in pronouncing a child autistic?


PS I too think the Venn diagram gets a bad rap as a table. There's no reason why a table should be arranged in rectangular grid form.


The Gelman blog has summarized discussions from this blog as well as others. Here.
