Feb 12, 2007

Horrid stuff

Ec_smoke

Small multiples can work wonders when data are replicated, as in this case. The chart accompanied an Economist article on pollution levels in several European cities, as indicated by the concentration of nitrogen dioxide and particulates.

In the junkart version, I plotted the data series side by side, rather than one over the other. Further, the cities are ordered by decreasing level of NO2, which seems to be the worse pollutant. All gridlines are removed except the line at 30, which works well to separate out the highly polluted cities.

Redopollutant

An odd pattern has now surfaced. Namely, there is some degree of negative correlation between the concentrations of the two pollutants. Environmental scientists may be able to tell us why.


Reference: "The Big Smoke", Economist, Feb 3 2007.

Feb 01, 2007

Error spotting

My friend Augustine pointed me to this interesting graph showing the time of sunset over the course of a year.  (The original author's write-up is here.)

Flickr_sunset

Of course, one can produce a perfect chart by looking up meteorological records. The main interest in this graph is how it was constructed. Each cell in the graph represents an hour of a day, with days running across and time running down. The cells that are not dark each contain a photograph of the sunset contributed to Flickr, the photo-sharing site. So this is in effect a graph created through mass collaboration (about 35,000 photos).
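For the curious, here is a minimal sketch of how such a grid might be tallied, assuming each photo carries a timestamp. The function name `bin_photos` and the sample timestamps are made up for illustration; this is not the original author's code.

```python
from collections import Counter
from datetime import datetime


def bin_photos(timestamps):
    """Count photos per (day of year, hour of day) cell."""
    counts = Counter()
    for ts in timestamps:
        day = ts.timetuple().tm_yday  # 1..366, runs across the graph
        counts[(day, ts.hour)] += 1   # hour 0..23, runs down the graph
    return counts


# Hypothetical timestamps of photos tagged "sunset"
photos = [
    datetime(2006, 6, 21, 20, 45),
    datetime(2006, 6, 21, 20, 10),
    datetime(2006, 12, 21, 16, 30),
    datetime(2006, 12, 21, 3, 5),  # a stray: wrong clock or wrong time zone
]
grid = bin_photos(photos)
```

Any cell with a nonzero count would be lit; cells with no photos stay dark.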

The "white" band roughly indicates the sunset.  What intrigues me is the variability... what are the reasons for lighted cells appearing all over the graph?

Some ideas include:

  • Different time zones
  • Incorrect time setting by some photographers
  • Erroneous tagging of photos as "sunset"

Jan 17, 2007

Losing count of Doomsday

The Doomsday Clock is making the news today: because of the growing nuclear threat and continued denial of global warming, scientists say we are "five minutes from Doomsday".

Nyt_doomsdayclock

This graph traces the movement of the clock's hand over the last few decades. (I think it appeared on the New York Times website but I cannot find it now.)

The little tickmarks are superfluous, and the thin white borders between red columns serve only to make us dizzy. As shown below, a line chart is much easier on the eyes.

Redo_doomsday

Now, a question for the scientists: Why the clock analogy? Does it reflect a kind of fatalism that we can never be more than 60 minutes away from Armageddon? How many minutes were we from Doomsday two hours ago?

Dec 06, 2006

Mid-week entertainment

A contribution from a regular reader: apocalypse U.S. News style!

Apocalypse


According to the scientists, super-volcanoes and viral pandemics are our biggest threats.


Dec 05, 2006

Time travel

Cambridge_traveltime_web

One of my scientific heroes and seminal teachers is Professor Frank Kelly at Cambridge.  What a pleasant surprise to see his involvement in a data visualization project.  To cite his wise words:

"The travel-time maps are more than just pretty to look at; they also demonstrate an innovative way to use and present existing data. We are entering a world where we have access to vast quantities of data, and ways of turning that data into information, often involving clever ideas about visualisation, are becoming more and more important in science, government and our daily lives."

The little black dot near the center of the map indicates the Mathematics building at Cambridge.  The contours (vaguely visible at our scale) represent intervals of 10 minutes by public transportation away from the black dot.  Any colored dot on the map refers to the time at which a traveller must leave in order to get to the Math building by 9 am, taking into account traffic situation, time of day, and decisions.  The hope of such maps is to help commuters (by public transit) plan their travel.

Professor Kelly has a very nice write-up on the intricacy of generating the data for such a map, which includes techniques of sampling, smoothing, extrapolation and so on.  It is rare that we get insights into the chart-making process.  He also carries a larger version of the travel-time map.

A similar article can be found at Plus magazine.

Dec 01, 2006

Smoking-Screening

Smokeathome2

Behind the smokescreen lies the informative conclusion: among households with smokers, about 40% smoke in residence all the time while about half never smoke in residence.

This graphic, unfortunately chosen, contains many distractions from the main message, including:

  • the liberal sprinkling of colors
  • the inclusion of data for 1, 2, 3, 4, 5, 6 days, almost all of which were effectively zero
  • the redundant vertical scale, as all the data already appeared on the chart itself
  • the comparison of smokers to "total sample" (rather than non-smokers)
     

The last point merits special attention.  The total sample contains households with smokers as well as households without smokers. Any data from the total sample is a weighted average of these two types of households.  It is better to directly compare the two household types than to indirectly compare one type to the overall.

Further, households without smokers should be extremely likely to have no smoking in residence all week. And if most households have no smokers (76% of this sample), then the statistics of the total sample will mimic those of no-smoker households. That is to say, the total sample statistics do not add much to the analysis. Our junkart version below corrects for this as well as other things.
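To put rough numbers on the weighted-average argument, here is a small sketch. The 76% share and the "about half" figure come from the discussion above; the 98% figure for no-smoker households is an assumption invented for illustration, not a number from the report, and the function name is hypothetical.

```python
def total_sample_rate(share_no_smokers, rate_no_smokers, rate_smokers):
    """Total-sample rate as a weighted average of the two household types."""
    return (share_no_smokers * rate_no_smokers
            + (1 - share_no_smokers) * rate_smokers)


share_no_smokers = 0.76               # share of households without smokers
never_in_smoker_households = 0.50     # "about half" never smoke in residence
never_in_nonsmoker_households = 0.98  # ASSUMPTION: made-up figure, not from the report

total = total_sample_rate(
    share_no_smokers,
    never_in_nonsmoker_households,
    never_in_smoker_households,
)
print(f"never smoke in residence, total sample: {total:.0%}")
```

Under these inputs the total-sample figure sits much closer to the no-smoker households than to the smoker households, so comparing smokers against the total understates the gap between the two household types.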

Redo_smokeathome

One of the key functions of a graph is data reduction, i.e. to aggregate data in such a way as to expose the information contained within. Typically, a graph that uses aggregated data is clearer and stronger than one that plots every piece of data. In this example, by combining 1-6 days into a single category ("smokes in residence part of the week"), we have a graph that is much more readable.
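The aggregation itself is simple. A sketch, assuming the data arrive as percentages keyed by number of days per week; the figures below are illustrative stand-ins chosen to match "about half" and "about 40%", not the survey's actual numbers, and `collapse_days` is a hypothetical helper.

```python
def collapse_days(pct_by_days):
    """Collapse a 0-7 days distribution into three categories."""
    return {
        "never": pct_by_days[0],
        "part of the week": sum(pct_by_days[d] for d in range(1, 7)),
        "all the time": pct_by_days[7],
    }


# ASSUMPTION: illustrative percentages for households with smokers
smoker_households = {0: 50, 1: 2, 2: 2, 3: 1, 4: 1, 5: 2, 6: 2, 7: 40}
summary = collapse_days(smoker_households)
```

Eight bars become three, and the near-zero categories no longer compete for attention.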

I want to thank Dr. Mike Rabinoff for inspiring me to look up these second-hand smoking statistics.  Mike recently published a book called "Ending the Tobacco Holocaust", which tells you more than you want to know about the tobacco industry.


Reference: "Second Hand Smoke Survey: Final Report", Madison Department of Public Health, Dec 2003.

Nov 26, 2006

Wading in waste

Sciam_bacteria

A poor graphic leaves readers wading in waste, in this case, the waste of time. (Thanks to a tip from Dr. Bruce W.)

This very busy chart conveys a simple research finding: that the density of bacteria increases with the prevalence of impervious surfaces. As Bruce pointed out, underlying this chart are but six observations taken at selected tidal creeks, each observation being a (paired) measurement of bacteria count and prevalence of impervious surfaces.

A factory's worth of graphical elements was employed, including columns, pies, colors, data labels, legends and so on. The result is utter confusion. How is it that the tip of each column does not coincide with the center of each pie? Do equal-sized pies imply equal surface areas? What is the bacteria count at each location?

Redo_bacteria

A scatter plot brings out the key correlation with minimal fuss.

Reference: "Wading in Waste", Scientific American, June 2006

Nov 20, 2006

Flight of fancy

Wiredh5n1sm

The venerable Wired magazine has surely gone too far with this flight of fancy!  Consider:

  • The zig-zagging lines streaming across the map
  • The redundant white dots, each of equal size, contradicting the black dots, with size proportional to prevalence
  • The inexplicable use of 00, 01, 02, ...
  • The use of a taller column for human cases, which, when tallied, amount to about 1/20 the number of bird cases
  • The inclusion of Australia (with zero cases) while excluding the Americas (also zero cases)
  • Ordering the countries neither by bird nor human cases but by convenience of placement on the map

Redoh5n1

As with a previous example, the map adds nothing to the data except for providing a lesson in geography. We prefer a parallel bar chart, shown on the right. Here, the continents are given different colors. In an unusual move, I chose different scales for each side as I am more interested in the distribution among countries, rather than the relative prevalence of bird/human cases.

Reference: "Flight H5N1: Delayed", Wired Magazine, October 2006.

Oct 14, 2006

Racetrack entertainment

A warm welcome to readers of Science.  (Junk Charts is selected as "Best of the Web" this week.  Also thanks to Mitchell for the nice write-up.)

Wiredgreen

Racetrack graphs were a novelty item here some time ago. They made an appearance in the October issue of Wired Magazine, known for its design. We have already discussed information distortion in such charts.

This chart fails the self-sufficiency test, forcing readers to read and interpret the data labels, and to ignore the racetrack construct.

Graphical elements applied as cosmetics?  Charts sacrificing data integrity for entertainment?  This takes us back to our previous discussion: can good charts be entertaining?  Now flipped over: can entertaining charts be good?

Reference: "Good, Green Livin'", Wired Magazine, 10/2006.

Oct 09, 2006

Graphical equity 3

Zuil provides an alternative rendering of the Sankey diagram / flow chart.  This one is surely superior, being easier to understand while capturing more information than the previous example.

Govt_sankey2_1

Ultimately, however, this type of chart will please specialists more than the general reader.

It is designed to be purely descriptive, which explains the absolute equality given to each flow, as indicated by the choice of unique colors and/or patterns for each.

As a data graphic, it can be improved if the designer has a point to make. In that case, only the relevant flows would be highlighted while all others stay in the background.

As it stands, this chart murmurs but does not opine.

Reference: "U.S. Energy Flow - 2002", Energy & Environment Directorate, Lawrence Livermore National Laboratory.
