Apr 25, 2007

Shower of bullets

Here's one of those infographics that makes the reader work hard (via Dustin J). The graphic in its full glory is here; it's much too large to be reproduced, so I have clipped off the bottom half.

Much to the designer's credit, he extracted data of interest, rather than trying to cram everything onto the page.  In particular, he was most interested in the distribution of deaths among different age groups, the types of deaths (suicides, homicides) and the identities of the deceased (race, gender).

Just like the election fraud graphic, such rich data lend themselves to multiple levels of aggregation.  Here, the designer focuses on the most detailed level, making it easiest to see facts like "among the 18-25 age group, there were 6 black men murdered per day".

However, it takes much more attention to notice higher-level facts like "homicides per day are relatively flat across age groups while suicides heavily skew toward 40+".

In the junkart version, I decided to emphasize the more aggregated data, showing the number of deaths of each type across age groups. The detailed breakdown by race and gender is placed in parentheses, as it can be skipped by less serious readers.

The reader who discovers the homicide/suicide pattern described above may surmise that homicide gunfire deaths are more "random" while suicides, being premeditated, may affect older people disproportionately. More research would be needed to confirm this and other suspicions.

Source: "An Accounting of Daily Gun Deaths", New York Times, April 21 2007.

 

Feb 22, 2007

Bubbles of death 2

Here is an alternative way to present the death risk data. It's a variation of Tukey's stem-and-leaf plot. Instead of presenting the exact odds, I believe it is sufficient to generalize the data by grouping them into categories. Not much is to be gained by knowing that the odds of dying from fire and smoke are 1 in 1113, as opposed to knowing that they fall in the range 1 in 1000 to 1 in 10,000 and are comparable to those of drowning, motorcycle accident, etc.
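The grouping idea can be sketched in a few lines of Python. This is only an illustration, not how the chart was actually produced: the function names are made up here, the power-of-ten ranges are an assumption based on the "1 in 1000 to 1 in 10,000" example, and apart from the fire-and-smoke figure the odds in the sample data are hypothetical placeholders.

```python
from collections import defaultdict

def odds_bucket(n):
    """Return the (low, high) power-of-ten range containing odds of '1 in n'."""
    exponent = len(str(n)) - 1  # number of digits minus one
    low = 10 ** exponent
    return low, low * 10

def group_by_bucket(odds):
    """Group causes of death by the order of magnitude of their odds."""
    groups = defaultdict(list)
    for cause, n in odds.items():
        groups[odds_bucket(n)].append(cause)
    return dict(sorted(groups.items()))

odds = {
    "fire and smoke": 1113,        # the one figure quoted in the post
    "hypothetical cause A": 5000,  # placeholder, not real data
    "hypothetical cause B": 80,    # placeholder, not real data
}

# Each printed row plays the role of one "stem" in the plot.
for (low, high), causes in group_by_bucket(odds).items():
    print(f"1 in {low:,} to 1 in {high:,}: {', '.join(causes)}")
```

Each row then lists the causes that share a range, which is all the reader needs in order to compare risks.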


PS. Be sure to look at Derek's chart in the comments.

Feb 06, 2007

Digging it out

Another sunset photo compilation?  Not quite.

This chart acts and smells like the sunset chart, being generated by many unknowing collaborators: this time, visitors to the content aggregation site Digg. For those unfamiliar, people browsing the web can "digg" any web page they find interesting (by clicking on an image), which causes a link to be generated on Digg's web site. We can use the number of Diggs to judge the value or popularity of a web page.

In effect, Digg is a gigantic save folder for the masses.  What happens when we have huge amounts of data?  We have to work really hard to dig out the useful information.  This chart goes quite a long way to answer one specific question.

Digg users are plotted horizontally and the stories they digged are plotted vertically. The bright white vertical strip represents suspicious activity: one user digged a large number of stories within the time window of the chart, and is most likely a bot trying to game the mass rating system.
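The bright strip is in effect a user whose column holds far more diggs than a typical column. A rough sketch of that check, assuming the data arrive as (user, story) pairs, might look like the following. The function name, the sample data, and the "10 times the median" threshold are all invented for illustration; nothing here describes how Digg itself detects cheaters.

```python
from collections import Counter
from statistics import median

def flag_heavy_diggers(diggs, factor=10):
    """Return users whose digg count exceeds `factor` times the median count."""
    counts = Counter(user for user, _story in diggs)
    typical = median(counts.values())
    return sorted(user for user, c in counts.items() if c > factor * typical)

# Made-up data: three ordinary users and one account digging 50 stories.
diggs = [("alice", "s1"), ("bob", "s2"), ("carol", "s1")]
diggs += [("bot", f"s{i}") for i in range(50)]

print(flag_heavy_diggers(diggs))
```

A count alone cannot separate a bot from an enthusiastic human, so a flag like this is a prompt for a closer look rather than a verdict.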

Flickr and Digg are two of the more prominent stories of the so-called "Web 2.0", or mass collaboration on the Web. Between my last post and this post, I have kind of lost enthusiasm for this type of chart, at least from a statistical perspective. There is no real collaboration: the photographer who contributed sunset No. 103 does not know the one who uploaded No. 31, for example. Using this logic, every survey or census ever conducted qualifies as mass collaboration, just because there are many participants providing data.

What's worse, a typical survey brings together results from a random sample.  These charts all have highly biased samples, and I haven't seen any discussion yet of this issue.  They cannot be interpreted without understanding who participated.

Reference: "How Digg Combats Cheaters", Technology Review, Jan 24, 2007.

Oct 14, 2006

Racetrack entertainment

A warm welcome to readers of Science.  (Junk Charts has been selected as "Best of the Web" this week.  Thanks also to Mitchell for the nice write-up.)

Racetrack graphs were a novelty item here some time ago.  They made an appearance in the October issue of Wired Magazine, known for its design.  We have already discussed information distortion in such charts.

This chart fails the self-sufficiency test, forcing readers to read and interpret the data labels, and to ignore the racetrack construct.

Graphical elements applied as cosmetics?  Charts sacrificing data integrity for entertainment?  This takes us back to our previous discussion: can good charts be entertaining?  Now flipped over: can entertaining charts be good?

Reference: "Good, Green Livin'", Wired Magazine, 10/2006.
