If we report it, it's a fact

David Leonhardt wrote in the NYT of a shocking incident of statistical abuse committed by Lou Dobbs and the CNN crew.

On several recent occasions, while commenting on the red-hot immigration issue, Lou and company remarked that "there had been 7,000 cases of leprosy in this country over the previous three years, far more than in the past".  (Leprosy, or Hansen's disease, is a chronic bacterial infection of the skin and nerves; in the U.S., many cases occur among immigrants, particularly of Asian or Latin American origin.)

When asked about fact-checking, Lou reportedly said: "If we reported it, it's a fact."  A quick visit to the government's leprosy program web-site immediately reveals the time-series chart, shown on the left.  With annual counts at about 150 in the last 5 years or so, one is hard pressed to find the 7,000 alleged cases!

Furthermore, because this chart shows raw counts with nothing to compare them against, we fail to see that 150 cases out of a population of 300 million represent a minuscule risk.
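To put numbers on the gap, here is a minimal back-of-the-envelope sketch in Python.  It uses only the figures quoted above (the 7,000 claim, roughly 150 cases a year, and a round 300 million population); the variable names are mine, and the 150 is an approximate reading of the chart rather than an official count.

```python
# Figures quoted in the post; 150 is an approximate reading of the chart,
# and 300 million is a round number for the U.S. population.
claimed_cases_3yr = 7_000
annual_cases = 150
population = 300_000_000

# Cases the chart implies over a three-year window.
implied_3yr = annual_cases * 3

# How many times larger the on-air claim is than the chart's figure.
overstatement = claimed_cases_3yr / implied_3yr

# Annual cases expressed per 100,000 people.
rate_per_100k = annual_cases / population * 100_000

print(f"Implied by chart over 3 years: {implied_3yr}")
print(f"Claim is about {overstatement:.1f}x the chart's figure")
print(f"Annual risk: {rate_per_100k:.2f} per 100,000 people")
```

Under these assumptions, three years of data add up to about 450 cases, so the claim overshoots by a factor of roughly 15, and the annual risk works out to about 0.05 per 100,000 people.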

A slight downward trend is evident in the last 20 years or so; this record is even more impressive when we realize the population grew during this period.  These points can be made clearer in multivariate plots.

Source: "Truth, Fiction and Lou Dobbs", New York Times, May 30, 2007; U.S. National Hansen's Disease web-site.


Finding dots

Erik W. alerted me to this CNN map that shows FBI statistics about the safety of American cities.  As Erik pointed out, this is prototypical of chartjunk a la Tufte.  A lot of ink is used to depict 12 points of data (top 3 cities in safety, crime, improvement and decline).

Imagine the reader trying to find the 3rd most improved city.  She either has to find all the blue dots and then figure out which is #3; or she needs to find all the #3 dots and figure out which is blue.  As they say, it's "hard work".  In fact, finding the dots among the forest of large text is hard work by itself!

How would I re-make this chart?

  • Highlight only the states containing data (California, Michigan, Missouri, Ohio, Georgia, New Jersey, New York); gray out all other states and their boundaries
  • Separate the states from the cities; write each state name only once; reduce the font size
  • Instead of dots, use numbers.  So the most dangerous city (St Louis) gets a red "1", Oakland gets a purple "3", etc.
  • Remove Mexico, Canada and water from the map

The map gives the false impression that crime is relevant only along the coasts and the lakes, when in fact, the map is just saying that most cities in the U.S. are located along the coasts and the lakes.  Using such a map to depict city-level statistics creates distortion because cities are not evenly distributed across America.

Beyond that, what is the point of this map?  Is it merely a geography class telling us where each city is located?  How is it better than a simple table listing the cities in order?   

Reference: "U.S. City Safety Rankings", CNN, 2006.