
Perspective

Perspective is a key concern in graphs, as in other visual arts.  Mixing multiple perspectives on the same chart can be disorienting, as in this recent WSJ chart depicting the sharp drop in stock market values across the world in May 2006:

Wsjinflationsm

Despite its complicated appearance, the entire graph can be constructed using only 20 numbers (the stock market value in May and in June for each of the 10 countries).  The loss in value is offered in three perspectives: as a percentage (hanging bar); as a number (list); and as a change in square area (inverted L-shaped bit of the square).  Needless to say, such redundancy only serves to confuse readers.  More harmful is the square representation but I'll leave that for another day.

Redoinflation1

The message can be more clearly conveyed by focusing on one and only one perspective.  Our first reconstruction stresses the absolute value lost in dollars rather than the percentage.  From this perspective, the drop in US stock prices was the most severe, despite the lower percentage.  It is also evident how much larger the US market cap is than the rest of the world.  Consequently, the plunge in India had little impact on the global scene.

Redoinflation2_1

Alternatively, if our key message is the relative drop in stock market values, then we should take a different perspective.  Here, I merely re-created the above chart with a log scale for the vertical axis.  Based on the slopes of the line segments, one figures that India and Brazil suffered much more than the US on a percentage basis.  Notice that from this perspective, the size of the US market cap does not jump out at the reader.
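A small sketch in Python may help show why the two perspectives disagree.  The market values below are made-up placeholders, not the numbers behind the WSJ chart, and the function names are my own.  On a linear axis the slope of a segment is the absolute loss; on a log axis it is the change in the logarithm, which depends only on the ratio of June to May.

```python
import math

# Hypothetical market caps in $ billions as (May, June) pairs.
# Illustrative placeholders only, NOT the figures in the WSJ chart.
market_caps = {
    "Large market": (16000.0, 15200.0),  # a 5% drop
    "Small market": (500.0, 400.0),      # a 20% drop
}

def absolute_loss(may, june):
    """Drop in value: the steepness of the segment on a linear axis."""
    return may - june

def percent_change(may, june):
    """Relative change from May to June, in percent."""
    return (june - may) / may * 100.0

def log_slope(may, june):
    """Change in log10(value): the steepness of the segment on a log axis."""
    return math.log10(june) - math.log10(may)

for name, (may, june) in market_caps.items():
    print(f"{name}: loss={absolute_loss(may, june):.0f}, "
          f"pct={percent_change(may, june):.1f}%, "
          f"log slope={log_slope(may, june):.4f}")
```

In this toy example the large market loses far more in dollars, yet the small market has the steeper segment on the log axis, because the log slope reflects the percentage drop alone.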

As preparatory work, the graphic designer should always test different perspectives and pick one that brings out her key point.

Reference: "As Inflation Signs Grow, Divergent Reactions", Wall Street Journal, June 15, 2006.


PS. Posting frequency will drop or vanish in the next two weeks depending on my vacation.


Illusion of success

A columnist in the NYT asserted that a "little-known market timing model" has had an "impressive track record" over the "long run".  The following graph was offered:
Nytmarkettime_1

This is an illusion of success entwined in an illusion of charting. 

First on the charting: as discussed before, charts with double axes, each bearing different scales, leave the door wide open for manipulation.  I haven't studied their model and thus do not know if there is a one-to-one relationship between their "scale" and the return differential between the S&P and T-bills.  My suspicion is that someone superimposed the two time series, shifting them up and down to minimize the discrepancy.  How else does one decide how to align the two vertical scales?  (This was exactly the issue with Friedman's Petropolitics Law.)
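To illustrate how much freedom the second axis gives the chart-maker, here is a minimal sketch in Python.  It assumes, purely for illustration, that the alignment is done by a least-squares fit of a scale and an offset; I do not know how the NYT chart was actually aligned.  Both series and the function name are hypothetical.

```python
def fit_secondary_axis(primary, secondary):
    """Least-squares scale and offset so that scale * secondary + offset
    tracks the primary series as closely as possible."""
    n = len(primary)
    mean_p = sum(primary) / n
    mean_s = sum(secondary) / n
    cov = sum((s - mean_s) * (p - mean_p) for s, p in zip(secondary, primary))
    var = sum((s - mean_s) ** 2 for s in secondary)
    scale = cov / var
    offset = mean_p - scale * mean_s
    return scale, offset

# Hypothetical data: a model "signal" and a return gap on unrelated scales.
model_signal = [1.0, 2.0, 3.0, 4.0, 5.0]
return_gap = [-8.0, -3.0, 1.0, 7.0, 12.0]

scale, offset = fit_secondary_axis(return_gap, model_signal)
aligned = [scale * s + offset for s in model_signal]
largest_gap = max(abs(a - r) for a, r in zip(aligned, return_gap))
print(f"scale={scale:.2f}, offset={offset:.2f}, largest gap={largest_gap:.2f}")
```

The scale and offset are free choices, so any two series that move in roughly the same direction can be made to hug each other on the page.  The visual closeness says more about the axis choice than about the model.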

Now, the smell of success.  Meticulous readers will notice large gaps between the two lines since roughly the late 80s.  Is it a surprise then that the model was constructed using data through 1989 and published in 1993?  The columnist acknowledged as much, noting that "the model has had a mixed record in the 13 years since their study appeared."

And yet, he proffered this defence:

Even after the incorrect forecasts in the 1990's are taken into account, the model's overall record is good enough to be statistically meaningful and not likely to be mere luck.

His idea of "overall record" was to study 1968 through 2006, which includes 22 years "covered in the professors' original study" and 17 years since.

To put it simply, he took the 13 years of poor performance, and mixed in 26 other years, of which 22 were used in constructing the original model.  This is equivalent to averaging training accuracy and validation accuracy.  (Training data is used to build a model; validation data is used to test its performance on previously unseen data.)  Training accuracy is always high or else the modeler would have rejected the model; thus, no honest modeler would make claims based on training accuracy.  Measured on validation data alone, this model is most definitely foul, as the right-hand side of the graph vividly attests.
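The arithmetic of pooling can be sketched in a few lines of Python.  The year counts (22 in the original study, 17 since) come from the column; the hit counts are assumptions I made up for illustration, since the column gives no such tally.

```python
# Year counts are from the column; the hit counts are ASSUMED for illustration.
TRAINING_YEARS = 22     # years covered in the professors' original study
VALIDATION_YEARS = 17   # years after the model-building period

training_hits = 19      # hypothetical: years the model called correctly
validation_hits = 7     # hypothetical

def accuracy(hits, years):
    """Fraction of years in which the forecast was right."""
    return hits / years

training_acc = accuracy(training_hits, TRAINING_YEARS)
validation_acc = accuracy(validation_hits, VALIDATION_YEARS)
pooled_acc = accuracy(training_hits + validation_hits,
                      TRAINING_YEARS + VALIDATION_YEARS)

print(f"training:   {training_acc:.0%}")
print(f"validation: {validation_acc:.0%}")
print(f"pooled:     {pooled_acc:.0%}")
```

Under these assumed counts, the pooled figure sits well above the validation figure simply because the training years outnumber the validation years and were fitted by construction.  Only the validation number speaks to how the model does on unseen data.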

Here, we have seen an illusion of charting used to prop up an illusion of success.

And one more question: even if this model is acceptable, how is one to take advantage of it?  Such models have little utility unless one also has a sane investment strategy.  Having a model and having a useful model are worlds apart.

Reference: "An Old Formula That Points to New Worry", New York Times, June 18 2006.


Choke points

The following map highlights in color the stretches of major roads in the New York area that experience congestion.
Nytchokepoints

A couple of improvements can be made:

  • The smaller roads should be left out completely rather than dimmed; as noted in the text box, congestion on smaller roads is ignored
  • More than two colors will bring out the difference between different roads better; as it stands, it is hard to see that 78 is the most congested.  With more colors, the annotation would be rendered unnecessary

Reference: "The Next Thing in Tolls?", New York Times, June 14, 2006


History Shots Website

Ivo, a reader, sent me to this interesting site: History Shots.  It sells prints of visualizations of complex time-series data on a variety of topics, including this chart of PGA players:
Pgachart

It is hard to judge the success of these charts as we are not allowed to see the details.  They certainly provide a feast to the eye.  The inspiration for these charts is undoubtedly Minard's Napoleon chart, so praised by Tufte and others (see, for example, here).

In general, dense charts of this type serve specialists really well but will likely confound the general public.  The reader must spend time to learn the chart features before understanding the data.



Animation

Isaac, a reader, sent a very useful link to Man Investment's web brochure, which happens to utilize the same data table we discussed in an earlier post on comparing fund returns across years.  As he explained:

my colleagues and I attempted to get round this problem [Ed: of too much data and color in too little space] by using interactivity to selectively display the information in this style comparison tool

Manstylecompare

For illustration, the image on the left shows what happens when I mouse over one of the black cells: it subdues all the other fund categories, highlighting only the hedge fund index (black cells).

Successively mousing over different fund categories helps untangle the mess of colors -- to some extent.

Manstylecompare2

It is still difficult to notice patterns under this setup, e.g. how should the reader judge the relative merits of the black cells and the aquamarine cells?

The use of this table, animation notwithstanding, draws attention to the relative rankings of the fund returns.  The use of equal-sized cells presupposes, incorrectly, that the difference between any two adjacent cells is constant.  Examining the return numbers clearly shows that this does not hold.  In fact, a rank #1 fund in one year could have a return much lower than that of a rank #5 fund in another year.
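A quick sketch in Python shows what a rank table throws away.  The returns below are invented for illustration and are not taken from the Man Investments table; the category and year labels are placeholders too.

```python
# Hypothetical annual returns (%) by fund category; NOT the brochure's data.
returns_by_year = {
    "Year A": {"Cat 1": 4.0, "Cat 2": 3.0, "Cat 3": 1.0,
               "Cat 4": -2.0, "Cat 5": -10.0},
    "Year B": {"Cat 1": 40.0, "Cat 2": 30.0, "Cat 3": 25.0,
               "Cat 4": 12.0, "Cat 5": 9.0},
}

def ranked(returns):
    """(category, return) pairs sorted from best to worst."""
    return sorted(returns.items(), key=lambda item: item[1], reverse=True)

def adjacent_gaps(returns):
    """Return differences between consecutive ranks, best to worst."""
    ordered = [value for _, value in ranked(returns)]
    return [high - low for high, low in zip(ordered, ordered[1:])]

for year, returns in returns_by_year.items():
    print(year, ranked(returns), adjacent_gaps(returns))
```

In this made-up data the gaps between adjacent ranks vary widely within a year, and the top-ranked category in Year A returns less than the bottom-ranked category in Year B.  Equal-sized cells hide both facts.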

Manworldcompare

This next chart compares the hedge fund index to "world stocks".  It points to another challenge of chart-making, that of consistency.  Here, 2001-3 appeared to be amazing years for holding the hedge fund index although in the first chart, those same years had the hedge fund index in the middling ranks.  (You may notice that the animation had a glitch here so that the bars are floating above 0%.  On a recent visit to the site, I found out that they have fixed this error.)


Google trends 2

My previous post about Google Trends elicited some thoughtful responses.  I examined this chart that first appeared on the Sullivan blog:
Times_v_blogs776285_1

The chart showed "blog" as the impressive run-away winner in terms of search volume on Google.  I commented on the lack of a vertical scale, which is essential whenever we want to compare several lines.

The following chart compares NYT and the Wall Street Journal.  Based on this chart, one would conclude that the New York Times enjoys continued strength against one of its fiercest rivals.  Notice that the same line (for New York Times) appeared as a big loser in the chart above but a winner in this second chart.
Trendsnytwsj_1

Now, extracting the line for "blog" and pitching it against "iPod", I get the following chart:
Trendsblogipod

All of a sudden, we don't see the dramatic rise of the "blog" line anymore.  Instead, we notice a close mirroring of search volumes, with iPod searches coming ahead during several periods in time.

Are any of these conclusions valid?  It is hard to tell without the vertical scale.  We can only conclude that our perception of volume growth is conditioned by the specific set of lines being plotted.  Providing any scale, even an indexed volume measure, would vastly help us interpret these charts properly!
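Here is a minimal sketch in Python of how the set of lines plotted can change what we see.  It assumes that each chart is scaled to the largest value among the lines shown, which is my guess and not something Google documents on these charts.  The volumes and the function names are hypothetical.

```python
# Hypothetical search volumes over four periods; not actual Google data.
volumes = {
    "term A": [10, 12, 11, 13],
    "term B": [100, 150, 220, 300],
}

def rescale_to_chart(names):
    """Scale the chosen series so the tallest point on the chart equals 1.0,
    mimicking an unlabeled axis that adapts to whatever is plotted
    (an ASSUMED behavior, for illustration only)."""
    peak = max(max(volumes[name]) for name in names)
    return {name: [v / peak for v in volumes[name]] for name in names}

def index_to_base(series, base=100.0):
    """Index a series so that its first observation equals `base`."""
    return [v / series[0] * base for v in series]

together = rescale_to_chart(["term A", "term B"])
alone = rescale_to_chart(["term A"])
print("term A next to term B:", [round(v, 3) for v in together["term A"]])
print("term A on its own:    ", [round(v, 3) for v in alone["term A"]])
print("term B indexed:       ", index_to_base(volumes["term B"]))
```

The same "term A" series hugs the floor in one chart and fills the frame in the other.  An indexed series, by contrast, carries its own scale: the last line tells us "term B" tripled over the period, whichever other lines happen to be on the chart.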