
Comments

Bob

I agree, the bubble chart might not be the best. IMO, the column chart a) is not visually pleasing and b) marginalizes the trend the authors are trying to emphasize, reducing it to about 4-6 lines in the histogram. Perhaps something more akin to a violin chart would be better? This is a case where the individual data points are not the -point-.

Kaiser

Bob: Smoothing the data first will solve this problem. The "trend" here requires interpolation.
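As a rough illustration of what smoothing first could look like, here is a minimal sketch of a centered moving average. The function name, the window size and the sample counts are all assumptions made up for this example; they are not the data or the method behind the charts in the post.

```python
def moving_average(values, window=3):
    """Centered moving average; the window is shortened at both ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical daily counts with one isolated spike (not real data).
daily_counts = [0, 0, 9, 0, 0]
print(moving_average(daily_counts, window=3))
```

A wider window spreads each spike over more days, so the choice of window decides how much of the day-to-day noise survives in the chart.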

jlbriggs

I agree that this particular bar chart is not visually pleasing, and also that it doesn't quite drive the point home (I don't think the original made much of a point either, though).

I don't agree that showing the point as "only" 4-6 lines in a histogram is marginalizing it. 4-6 lines in a histogram can mean a great deal.

Interpolation, or simply aggregation, will go a long way to fix that.

But I think the main point is that the bubbles, other than looking pretty, aren't really telling us much, and there are better ways to show it.

junkcharts

I had earlier wanted to link to Kosara's response to this blog post, but Typepad's spam filter struck again and removed a comment by the author of the blog!

Here is Kosara's take on this post: link.

As you can see from the above comments, I agree with him that smoothing helps.

When I retained the spikes, they were cued by the original chart in which certain days were highlighted. But I agree that those dates were probably not very meaningful.

RobMeekings

My first issue here is that the data doesn't directly talk about the number of infections, but rather reports the number of birds killed; without knowing more about the relationship between infection and culling we can't judge to what extent the one is a good proxy for the other.

I also think that looking at just the totals is misleading; it misses the fact that there are more interesting stories here than are captured by these aggregate charts.

For example, a brief study of the data reveals that there are two strains of the virus: H5N8 and H5N2; the first large culls (23 Jan and 12 Feb) are of flocks with H5N8, so could (should?) be omitted if the story is about H5N2.

Based on the reported species, the vast majority (99.8%) of birds destroyed were chickens and turkeys of one sort or another. Grouping these two together, disregarding the ducks, pheasant and other or mixed species, and looking at the timeline, it appears that the virus affects turkeys first and chickens later.

This detail could lead to hypotheses about spread from the smaller (?) turkey population through mixed flocks to the large chicken flocks.
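The breakdown described above (keep one strain, keep only chickens and turkeys, then look at the timeline) can be sketched roughly as follows. The record layout, the field names, the function names and every value in `records` are hypothetical placeholders, not the actual reported data.

```python
from datetime import date

# Made-up records for illustration only; dates and counts are placeholders.
records = [
    {"date": date(2000, 1, 23), "strain": "H5N8", "species": "chicken", "birds": 500},
    {"date": date(2000, 3, 1), "strain": "H5N2", "species": "turkey", "birds": 300},
    {"date": date(2000, 3, 20), "strain": "H5N2", "species": "duck", "birds": 10},
    {"date": date(2000, 4, 10), "strain": "H5N2", "species": "chicken", "birds": 900},
    {"date": date(2000, 4, 15), "strain": "H5N2", "species": "turkey", "birds": 200},
]

MAIN_SPECIES = {"chicken", "turkey"}


def select(records, strain):
    """Keep one strain and only the two main species groups."""
    return [r for r in records
            if r["strain"] == strain and r["species"] in MAIN_SPECIES]


def first_seen(records, strain):
    """Earliest reported date per species for the chosen strain."""
    earliest = {}
    for r in select(records, strain):
        current = earliest.get(r["species"])
        if current is None or r["date"] < current:
            earliest[r["species"]] = r["date"]
    return earliest


def totals(records, strain):
    """Total birds destroyed per species for the chosen strain."""
    result = {}
    for r in select(records, strain):
        result[r["species"]] = result.get(r["species"], 0) + r["birds"]
    return result


print(first_seen(records, "H5N2"))
print(totals(records, "H5N2"))
```

Comparing the earliest dates per species is only a first look; whether the ordering means anything would depend on how the culls were reported.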

I don't have knowledge of poultry or virology, so don't know if these are valid concerns and hypotheses, but these are the sorts of stories I'd like to tease out from the data.


Kaiser Fung. Business analytics and data visualization expert. Author and Speaker.