Mar 01, 2008

Don't believe what you see

Mankiw's blog linked to a press release by Congressman Jim Saxton, using CBO data to show "middle income tax burden at lowest level in decades".  The attached graph, as Junk Charts readers will immediately recognize, is classic chartjunk.  Every time the vertical axis does not start at zero, one suspects something is amiss.  And what's with the gridlines and data labels?
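To see why a vertical axis that does not start at zero invites suspicion, here is a minimal sketch that compares the true relative change in a value with the change a reader sees when heights are measured from a truncated axis.  The function name and all numbers are made up for illustration; they are not taken from the CBO data or the press release.

```python
def apparent_drop(start, end, axis_min=0.0):
    """Fraction of the plotted height that disappears between two values.

    Heights are measured from axis_min, so with axis_min=0 the result
    equals the true relative change.
    """
    if start <= axis_min:
        raise ValueError("start must sit above the axis minimum")
    return (start - end) / (start - axis_min)


# Hypothetical values, chosen only to illustrate the effect.
start, end = 20.0, 15.0

true_drop = apparent_drop(start, end)                   # axis starts at zero
drawn_drop = apparent_drop(start, end, axis_min=12.0)   # truncated axis
exaggeration = drawn_drop / true_drop

print(f"true drop: {true_drop:.1%}")
print(f"drawn drop: {drawn_drop:.1%}")
print(f"exaggeration: {exaggeration:.1f}x")
```

With these invented numbers, a fall of one quarter is drawn as a fall of well over half the plotted height.  The closer the axis minimum sits to the data, the larger the exaggeration.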

"Don't believe it? Check out the data source yourself."  I followed Mankiw's suggestion and was indeed surprised... but not by the great fortune of the "middle class".  The surprise was how the chart painted a dishonest picture of the CBO data.

The original chart plotted only the tax rate experienced by the middle 20% of the population. 
The CBO provided data for all five quintiles; why not plot them all?  In this new chart (right), the "surprise" windfall to the middle 20% proved not to be anything special at all!  All five quintiles, especially the middle three, followed pretty much the same trend over time.  The effect of singling out the middle 20% is to strip away the context by which the data should be interpreted.

Further, what might be the result of the declining middle income tax burden?  The CBO data painted an unexpected picture.  Paradoxically, as the middle 20% saw their tax rate decrease, they also earned a smaller share of the nation's after-tax income (black line at right).  At the same time, the top 1% saw their share of after-tax income double from about 8% to almost 16% (blue line).  The top 20% line is also upward-sloping, although less steeply.  So, the implication that the middle class have had it good is plainly wrong.

What is going on?  Two factors were at play and the Congressman presented only one side of the story (the tax rate).  What he omitted was that during this period, the nation's wealthy took home larger and larger shares of the pre-tax income.  This shift in pre-tax income more than offset any relative reduction in tax rate for the middle 20%.
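The interplay of the two factors can be put into simple arithmetic.  A group's share of after-tax income equals its share of pre-tax income, multiplied by the fraction it keeps after tax, divided by the fraction the nation as a whole keeps.  The sketch below uses invented figures (not the CBO numbers) and a hypothetical helper, after_tax_share, to show a group whose tax rate falls while its after-tax share also falls.

```python
def after_tax_share(pretax_share, tax_rate, overall_tax_rate):
    """Share of national after-tax income held by one group.

    pretax_share     : group's share of national pre-tax income (0 to 1)
    tax_rate         : group's effective tax rate (0 to 1)
    overall_tax_rate : effective tax rate on all income nationally (0 to 1)
    """
    return pretax_share * (1 - tax_rate) / (1 - overall_tax_rate)


# Invented figures for illustration only; these are NOT the CBO data.
earlier = after_tax_share(pretax_share=0.16, tax_rate=0.19, overall_tax_rate=0.22)
later = after_tax_share(pretax_share=0.13, tax_rate=0.14, overall_tax_rate=0.21)

print(f"earlier after-tax share: {earlier:.3f}")
print(f"later after-tax share:   {later:.3f}")
```

In this made-up case the group's tax rate drops from 19% to 14%, yet its after-tax share still shrinks because its pre-tax share fell by more than the tax cut gave back.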

This distortion can be traced back to the use of quintiles (or more generally, ranks).  We use them to cope with data having extreme distributions, but a by-product is losing information about how extreme the extreme values are.  As demonstrated here, the quintiles of old are really different from the quintiles of today because the underlying distribution has become much more extreme.
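A toy example makes the point about ranks concrete.  The sketch below splits two invented income lists into five equal-sized rank groups and computes each group's share of the total.  The lists and the helper quintile_shares are hypothetical; the only difference between the two lists is the single largest value.

```python
def quintile_shares(incomes):
    """Split incomes into five equal-sized rank groups (lowest first)
    and return each group's share of total income."""
    ordered = sorted(incomes)
    n = len(ordered)
    if n == 0 or n % 5 != 0:
        raise ValueError("this sketch assumes a non-empty count divisible by 5")
    size = n // 5
    total = sum(ordered)
    return [sum(ordered[i * size:(i + 1) * size]) / total for i in range(5)]


# Invented incomes for illustration; only the top value differs.
then = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
now = [10, 20, 30, 40, 50, 60, 70, 80, 90, 1000]

shares_then = quintile_shares(then)
shares_now = quintile_shares(now)

print("middle quintile share, then:", round(shares_then[2], 3))
print("middle quintile share, now: ", round(shares_now[2], 3))
print("top quintile share, then:   ", round(shares_then[4], 3))
print("top quintile share, now:    ", round(shares_now[4], 3))
```

The middle quintile contains the same two incomes in both lists, and the rank labels are identical, yet its share of the total shrinks sharply once the top value becomes extreme.  The rank alone says nothing about that change.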

Finally, another bit of mystery (to me) is how the middle 20% came to be considered "middle class".  Is there a widely accepted definition?

Reference: "CBO Data Show Middle Income Tax Burden At Lowest Level in Decades", Feb 21 2008.

Sep 17, 2007

Structuring a chart

This chart from the NYT was intended to show how the EPA has moved the bar on vehicle mileage ratings: 2008 estimates were lower than 2007 estimates across the board, regardless of manufacturer, model and city/highway.

The chart was built from one basic component, repeated for each model. 
I like the discreet gridlines (the white ticks) which enable readers to count off the mileage ratings.

The data is rich: ratings were given along three dimensions (model, year of estimate and city/highway).  Readers would benefit from stronger guidance on where to look for the most pertinent information.  As the chart stands, it is merely a container for the data.  It fails our self-sufficiency test: all the data were printed on the chart, and the bars add little.

In the junkart version, I use knowledge of the data to structure the chart. First, noting that sedans, hybrids and trucks/SUVs/minivans have different levels of mileage ratings, I clustered the models into three groups.  Second, the city and highway ratings were separated into two columns as I consider the between-model comparisons more important than city-highway comparisons.
The chart is a dot plot, with a vertical tick for 2007 estimates and a dot for 2008 estimates.  It's easy to see that all dots sit to the left of the vertical ticks.

More subtly, we can also see that the hybrids appeared to have been penalized more.  Or perhaps, the higher the rating, the larger the downward adjustment...

Source: "Mileage Ratings Are Still Estimates, Though Closer to Reality", New York Times, Sept 16 2007.

Aug 15, 2007

Could-be-light entertainment

It's the heat of the summer so here's another entertaining contribution.  Mike K, a reader, helpfully points us to this chart from The Onion (a satirical paper).

The artist must know some best practices since he/she can get so many things wrong at once.  At least he/she can do math: the percentages do add up to 100.

Histograms are the second most popular chart; that's a surprise!

Source: "America's Most Popular Charts", The Onion, Jan 7, 2007.

Jul 09, 2007

Adulterated education

A good teacher makes a great difference.  Reader Richard M drove this point home when he sent in a junk chart posing as educational material. The offending graphic is used by BBC's Skillswise website to teach "Handling data: Graphs and Charts".  Skillswise is an otherwise laudable effort to help adults "improve their basic skills in reading, writing and maths".

Even for pros, each question is a challenge.  Question 7 really requires a new pair of glasses.

The entire worksheet is located here.  The use of patterns for shading is especially disconcerting.  The graphic also lacks self-sufficiency as we have trouble comparing countries without referencing the underlying data.  As we discussed before, a good graphic is one in which graphical objects (bars, pies, dots, etc.) illuminate the underlying data; when all the data must be printed next to the objects, the graphic is most likely redundant.

Source: BBC Skillswise website.
 

Jun 26, 2007

Dizzy display

Xan G. tells us that these "inconsistent pie charts ... make [his] head hurt".  The dizzying array of colors is unfortunate, especially when "Application" gets a medium blue in three of four pies but an orange-red in one of them.  Just like the baby names charts, it's important to keep the background constant when constructing small multiples.

We cite from the horse's mouth:

The goal of this section was to uncover any [software development] task that might be overlooked [by these startup companies]. When writing a software product, the tendency is to focus 100% on the application. Items like support, marketing, and especially billing never cross your mind.

The junkart version below is designed to bring out this one message: that Blinksale has distinguished itself from the rest by having spent more time developing code for purposes other than the application itself.

I removed the raw counts of lines of code and focused only on the relative proportions.  The former does nothing to argue the author's case.

The pie charts fail our self-sufficiency test.  The reader must rely on the data table and data labels to understand the chart.  If those are removed, the key message is obscured.

Source: "Web App Autopsy", ParticleTree, June 2007.

Jun 17, 2007

Foreground, background

Derek C. points us to this effort by a science journalist to use graphs to help "clarify the concept of climate change".  The graph on the left shows that actual greenhouse gas emissions have exceeded the level predicted by the most pessimistic climate models.  The 3D bar chart on the right examines which countries have increased emissions the most since 1990.

While the bar chart contains many of Tufte's "ducks" (not sorted by percent change, 3D, color, gridlines, sufficiency, etc.), it's the left chart that can be made more powerful.

The casual observer does not need to know which model led to which trajectory of predictions; the graph is vastly simplified, and the message much clearer in the junkart version.  (I only included the CDIAC data because I didn't locate the EIA numbers.)

The general point here is recognizing what is foreground, and what is background.  Aside from gridlines, data labels, axis labels and so on, some of the data usually constitute background material, often as in this case being used to establish comparability.

One message I got out of this chart is that these climate models have done a good job!  (Now, I have no idea if part of the curve included the training period.  It is curious that the predictions were very narrowly contained in the early 1990s.)

Source: The Island of Doubt Blog, June 6, 2007.

Oct 14, 2006

Racetrack entertainment

A warm welcome to readers of Science.  (Junk Charts is selected as "Best of the Web" this week.  Also thanks to Mitchell for the nice write-up.)

Racetrack graphs were a novelty item here some time ago.  They made an appearance in the October issue of Wired Magazine, known for its design.  We have already discussed information distortion in such charts.

This chart fails the self-sufficiency test, forcing readers to read and interpret the data labels, and to ignore the racetrack construct.

Graphical elements applied as cosmetics?  Charts sacrificing data integrity for entertainment?  This takes us back to our previous discussion: can good charts be entertaining?  Now flipped over: can entertaining charts be good?

Reference: "Good, Green Livin'", Wired Magazine, 10/2006.
