Apr 08, 2007

Peripherals 1

Like any technology, charts come with peripherals: I'm talking about legends, data labels, grid-lines and so on.  These things typically give us the most trouble, especially with complex data sets.  The analogy is apt: one can end up as inextricably knotted as a bunch of cords and wires.

Interactive graphics is a particularly elegant solution to this problem, and Google Finance has done a fantastic job leading the way.  One trick is to show the legend only when the user asks for it.

Google_sectorsum_lg

Using bar charts (on the left), Google neatly summarizes the performance of stocks within each industry sector.  Each bar gives a sense of the dispersion, which complements the average return printed next to it.  For example, most sectors gained on average, yet about 30% of the individual stocks in most sectors actually declined on that day.  So the fact that technology stocks gained 0.48% on average doesn't necessarily mean that the two tech stocks you own gained 0.48%, or gained at all.
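To make the gap between a sector average and its individual stocks concrete, here is a minimal Python sketch.  The ten returns are invented for illustration (chosen so that they average 0.48%, with 30% of the stocks declining); they are not Google Finance data, and `summarize` is a hypothetical helper.

```python
from statistics import mean

# Hypothetical one-day returns (in percent) for ten stocks in one sector.
# Invented numbers for illustration only, not Google Finance data.
sector_returns = [2.6, 1.9, 1.4, 1.1, 0.8, 0.4, 0.1, -0.6, -1.2, -1.7]


def summarize(returns):
    """Return (average return, share of stocks that declined)."""
    avg = mean(returns)
    share_down = sum(1 for r in returns if r < 0) / len(returns)
    return avg, share_down


avg, share_down = summarize(sector_returns)
print(f"average return:     {avg:+.2f}%")
print(f"share of decliners: {share_down:.0%}")
```

The average is positive, yet three of the ten stocks fell, which is exactly the information the bar adds to the printed average.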

Typically, we would put a legend on the side or at the bottom of the chart, which, all told, is an ugly duckling next to a well-executed chart.  Here, the legend is hidden behind the "What's this?" link.  The side benefit is that the legend can be as verbose as needed since it doesn't interfere with the chart.

There are a few minor things to consider:

  • "What's this?" is not very informative: Why not call it a "legend" or "key"?
  • The graph designer seems to think that the most important information sought by readers is the extremes, i.e. the percentage of stocks that gained/lost more than 2%.  The darkened ends of each bar draw attention away from the middle, which is the boundary between the gainers and the losers.  I'd like to see that boundary delineated.
  • Similar to the above point, I'd sketch out a version which aligns the gainer/loser boundary to the middle so it's easy to see the balance between gainers and losers.  This version, however, would require more space.
  • I'd provide sorting by average return, and by percentage of gainers.
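The last two suggestions can be sketched in a few lines of Python.  Everything here is an assumption for illustration: the sector figures are invented, the four bands (big losers, small losers, small gainers, big gainers) are my reading of the chart, and `centered_segments` and `pct_gainers` are hypothetical helpers.

```python
# Hypothetical data: average return (%) and share of stocks (%) in each band,
# ordered big losers, small losers, small gainers, big gainers.
sectors = {
    "Technology": {"avg": 0.48, "bands": [5, 25, 50, 20]},
    "Energy": {"avg": -0.30, "bands": [15, 40, 35, 10]},
    "Financials": {"avg": 0.10, "bands": [10, 15, 65, 10]},
}


def centered_segments(bands):
    """Return (start, end) for each band with the loser/gainer boundary at x = 0."""
    start = -(bands[0] + bands[1])  # losers extend to the left of zero
    segments = []
    for width in bands:
        segments.append((start, start + width))
        start += width
    return segments


def pct_gainers(bands):
    """Share of stocks that gained (small plus big gainers)."""
    return bands[2] + bands[3]


# Two sort orders: by average return, and by percentage of gainers.
by_avg = sorted(sectors, key=lambda s: sectors[s]["avg"], reverse=True)
by_gainers = sorted(sectors, key=lambda s: pct_gainers(sectors[s]["bands"]), reverse=True)

# Width the axis must cover once bars are centered on the boundary.
max_losers = max(100 - pct_gainers(v["bands"]) for v in sectors.values())
max_gainers = max(pct_gainers(v["bands"]) for v in sectors.values())
axis_span = max_losers + max_gainers

print(centered_segments(sectors["Technology"]["bands"]))
print(by_avg, by_gainers, axis_span)
```

With these made-up numbers the centered layout needs an axis 130 units wide instead of 100, which is the extra space mentioned above, and the two sort orders disagree about which sector comes first.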

Mar 27, 2007

Illusory disparity

The WSJ published a chart with the cheeky title of "Rich Get Richer" (reminiscent of the Economist).  The underlying data concerned one-, three- and ten-year returns for the buyout fund category.  For each return period, the overall mean and the means for the top and bottom 25% of funds were depicted.

I won't go into the relevance of the title as I simply could not figure out how it connected with the data.  The following shows the original chart side by side with the junkart version.

Redo_richgetricher

Improvements include:

  • Lines show the comparisons with a minimum of fuss compared with colored bars
  • The overall mean return is placed in the middle of each line segment where it belongs, instead of being the first column
  • The axis label, "annualized return", tells readers what the performance measure is
  • Adding the word "funds" to "top quartile" and "bottom quartile" removes the possible confusion that those represent the individual returns of the funds ranked at the 25th and 75th percentiles, rather than the average returns of the bottom 25% and top 25% of funds
  • The linear construct paints the correct picture that individual fund returns fall into a continuum

(Thanks to my students for some of these points.)

Reference: Wall Street Journal, Mar 3-4 2007.

Mar 08, 2007

Criminal chart

The Times found a sharp surge in violent crimes.

Nyt_crime


Uh-oh. 

The legend for the columns is missing.

The maximum murder rate of about 45 per 100,000 in the top chart is depicted by a column 9x as tall as the one showing the minimum aggravated assault rate of about 60 per 100,000 in the bottom chart.
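A quick back-of-the-envelope check, written as Python, shows how far apart the two vertical scales are.  It uses only the approximate figures quoted above; the variable names are mine.

```python
# Approximate readings quoted in the post.
murder_rate = 45    # per 100,000: tallest column in the top chart
assault_rate = 60   # per 100,000: shortest column in the bottom chart
height_ratio = 9    # the murder column is about 9x as tall

# If both panels shared one scale, the height ratio would equal the value ratio.
value_ratio = murder_rate / assault_rate
scale_mismatch = height_ratio / value_ratio

print(f"value ratio:    {value_ratio:.2f}")
print(f"scale mismatch: {scale_mismatch:.0f}x")
```

On these rough numbers, a unit of murder rate gets about twelve times the column height of a unit of assault rate, so heights cannot be compared across the two panels.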

Sorting by murder rate does a disservice to the bottom chart, rendering it essentially unreadable.

Reference: "Violent Crime in Cities Shows Sharp Surge", New York Times, March 9 2007.

Jan 16, 2007

Subjectivity

Irwebfeature_1

When I look at charts like this one, I ponder: Should graph designers adopt "objectivity" as practiced by American journalists?

Is it even possible to make "objective" charts?  Every design choice we make seems to chip away at some of the detachment.  In this chart, the choice to order important web-site features by shopper -- rather than merchant -- ratings is a tacit preference for those ratings.  Bringing out key messages in the data is a subjective act, isn't it?

Are "objective" charts useful?  In our example, the design choices are kept to a minimum, and so, it seems, is its usefulness.  In comparing shopper and merchant ratings, one would be most interested in identifying the most effective web-site features as well as those features offered by merchants that find little resonance with shoppers.  These questions are better addressed by directly plotting the average rank and the ranking gap between merchants and shoppers (see below).

Notice that I said "ranking" rather than "rating".  The footnote discloses that the ratings were obtained from two different surveys conducted by two different companies at two different times.  How should we interpret the difference of 13 percentage points between the 89% of shoppers rating "Free Shipping" "very to extremely helpful" and the 76% of merchants rating it "somewhat to very valuable"?

Redowebfeature

In the junkart chart, we can focus on three groups of features:

  • the three top features ("Promo Discounts", "Free Shipping" and "Keyword Search") which attained the best average rank and smallest ranking gap;
  • the three "orphan" features ("Recommended Products", "Top Sellers", "Gift Selection") created by loving web-site producers, abandoned by independent-minded shoppers;
  • the three "neglected stepchildren" ("Shop the Catalog", "Store Locator", "Product Comparison") whose importance to shoppers was vastly underestimated by the merchants.
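For readers who want to reproduce the idea, here is a rough Python sketch of turning two sets of ratings into an average rank and a ranking gap.  Only the "Free Shipping" figures (89% and 76%) come from the article; "Feature A" to "Feature C" and their numbers are placeholders, and the sign convention for the gap (merchant rank minus shopper rank) is my assumption, not necessarily how the junkart chart was built.

```python
# (shopper rating %, merchant rating %).  Only the Free Shipping pair is from
# the article; the other three rows are invented placeholders.
ratings = {
    "Free Shipping": (89, 76),
    "Feature A": (70, 85),
    "Feature B": (55, 40),
    "Feature C": (40, 60),
}


def ranks(values):
    """Rank features within one survey, 1 = highest rating (assumes no ties)."""
    order = sorted(values, key=values.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(order)}


# Rank within each survey separately, so the two rating scales never mix.
shopper_rank = ranks({name: pair[0] for name, pair in ratings.items()})
merchant_rank = ranks({name: pair[1] for name, pair in ratings.items()})

summary = {
    name: {
        "avg_rank": (shopper_rank[name] + merchant_rank[name]) / 2,
        # Positive gap: shoppers rank the feature higher than merchants do.
        "gap": merchant_rank[name] - shopper_rank[name],
    }
    for name in ratings
}

for name, row in sorted(summary.items(), key=lambda item: item[1]["avg_rank"]):
    print(f"{name:15s} avg rank {row['avg_rank']:.1f}  gap {row['gap']:+d}")
```

Because each survey is ranked on its own, the differently worded rating scales are never compared directly; only the orderings are.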

Unfortunately, while being "objective", the data table fails to point out anything of interest to the reader.

Reference: "Consumers want one thing -- merchants are delivering another", Internet Retailer, Jan 2007.

Sep 19, 2006

Jamming

Econ_muslims

Readers may have noticed that I'm not a fan of the graphic aesthetics of the Economist.  (I love their subtle sarcasm, a way of saying something without saying it.  For example, the title of this chart is "where they are".  They let us read any meaning into the word "they".  As for their charts, I have taken issue on several occasions.)

This particular example uses one of their standard formats, stacked bars with an extra data series tacked on the right, its boxed annotation calling attention to itself.  It's a case of too much apparatus for a simple task.

The chart's purpose is to show that the US and France have the largest Muslim populations by numbers while France is by far the top country by percentage.

Redo_muslims

Our junkart version is very much cleaner.  Line segments indicating the low, mid and high estimates replace the stacked bars (which falsely imply significance in adding the low and high estimates).  As usual, a minimum of gridlines and axes is used.  Instead of jamming two ideas onto one chart, if percentages are more important, then a separate chart should be produced, now ordered by decreasing percentages (see below).

The most crucial improvement is the fine print.  Perhaps extending their subtle sarcasm too far, the chart maker omitted context for interpreting the data: namely, that the low-mid-high range represents estimates by up to 5 different sources, each using potentially different methodologies for estimation.  This partially explains the huge variance in estimates for the US (or does it?).

Redo2_muslims

Also missing is a comment on why these particular 6 countries were selected.  The selection may give a misleading picture of "where they are" in the context of world population.

Reference: "Where They Are", Economist, June 2006.

 

Jul 31, 2006

Enigma of the big-buck pitcher

A data table accompanied a recent NYT article pointing out that big-buck pitchers were far from sure wins for the clubs that have taken Scott Boras' pitches.  The table contains a wealth of data but very little information is immediately revealed to the reader.

Nyt_bigcontracts


Sorting by size of contract makes no sense, especially since the key metric of success, i.e. change in winning percentage pre- and post-contract, cannot be discerned without pulling out a calculator.  Further, once the contract size is expressed in dollars per season, it is clear that all these contracts fall into the same range (about $10-13 million per year).

Bigcontracts

One graphical alternative is shown on the right.  It brings out the desired message, that big-buck pitchers may or may not perform after signing big-buck contracts.  The pitchers whose winning percentage improved or declined by more than 200 points are annotated.
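The calculator work the table leaves to the reader can be sketched as follows.  The three pitchers and all of their numbers are invented placeholders, not the NYT data; the field layout and the helper `win_pct` are assumptions made for illustration.

```python
# Hypothetical rows: (pitcher, contract in $ millions, years,
#                     wins before, losses before, wins after, losses after)
contracts = [
    ("Pitcher A", 91.0, 7, 60, 40, 45, 55),
    ("Pitcher B", 55.0, 5, 50, 50, 36, 24),
    ("Pitcher C", 30.0, 3, 30, 45, 39, 21),
]


def win_pct(wins, losses):
    """Winning percentage as a fraction, e.g. 0.600."""
    return wins / (wins + losses)


rows = []
for name, total, years, wb, lb, wa, la in contracts:
    rows.append({
        "pitcher": name,
        "per_season": total / years,  # $ millions per year
        # Change in winning percentage, in "points" (thousandths).
        "change_points": round((win_pct(wa, la) - win_pct(wb, lb)) * 1000),
    })

# Sort by the success metric instead of by contract size.
rows.sort(key=lambda r: r["change_points"], reverse=True)

# Label only the swings of more than 200 points in either direction.
annotated = [r["pitcher"] for r in rows if abs(r["change_points"]) > 200]

for r in rows:
    print(f"{r['pitcher']}: ${r['per_season']:.1f}M/season, {r['change_points']:+d} points")
print("annotate:", annotated)
```

Sorting on the change in winning percentage puts the key metric first, and the per-season figure makes the similarity in contract size visible at a glance.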

A graph cannot hope to achieve the data density of a data table.  But the process of making a graph forces the designer to focus on the most important data, which itself has great benefits.

Reference: "Big-buck pitchers are often big busts", New York Times, July 16, 2006.
