


If you think it needs cleaning up, clean it up! Errors on Wikipedia are like litter on a sidewalk. Clean it up.

Rick Wicklin

I disagree that "nothing is lost by smoothing." The smoothed chart, as you say, is superior for showing trends, but is inferior for faithfully representing the data. With the (unsmoothed) line chart, I can accurately determine the population growth for India in 1985. The smoothed chart underestimates that value, but I have no way of knowing that fact. Also, the smoothed version gives the erroneous impression that the population growth for Indonesia in 2010 was less than Brazil's, when in fact the opposite is true.

In general, I don't think that it is a good idea to replace a line plot with smoothers. I think that a graph should show the true data unless it explicitly indicates otherwise. When you substitute a model (=smoother) for the data, it can misrepresent the data. For example, a sharp drop-off in population growth (due to a war or natural disaster) would not show in the smoothers. If you are going to show smoothers, superimpose them on a scatter plot of the data.

Lastly, it is important to note that there is not a unique smoother for these data. Each smoother depends not only on the data but also on smoothing parameters. An unscrupulous analyst might choose a value for the smoothing parameter that shows features that do not truly exist in the data. Therefore, when you use smoothers, it is best to specify the smoothing technique (cubic spline, loess, ...) and how the fit was constructed.
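To make the point about smoothing parameters concrete, here is a minimal Python sketch. It assumes a plain centered moving average rather than whatever smoother was used for the chart, and the growth figures are made up for illustration (they are not the World Bank data). The `moving_average` helper and its `window` parameter are hypothetical names. The same series is smoothed lightly and heavily, and the value in the shock year is compared across the three versions.

```python
# Illustrative only: a simple moving average stands in for the chart's smoother,
# and the numbers below are invented, not the actual population data.

def moving_average(values, window):
    """Centered moving average; the window shrinks near the ends of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed

# Hypothetical annual growth rates (%), with a one-year shock at index 5.
growth = [2.0, 2.0, 1.9, 1.9, 1.8, 0.5, 1.8, 1.7, 1.7, 1.6, 1.6]
shock = 5

light = moving_average(growth, window=3)  # mild smoothing
heavy = moving_average(growth, window=9)  # aggressive smoothing

print("raw value in shock year:  ", growth[shock])
print("light smoother (window=3):", round(light[shock], 2))
print("heavy smoother (window=9):", round(heavy[shock], 2))
```

The wider the window, the more the one-year drop is averaged away, which is why the window (or bandwidth) should be reported along with the smoothing technique.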


I'm not terribly experienced with smoothing. Perhaps it's my ignorance, but my first thought was that it can't be true in this case that "nothing is lost by smoothing." While the point of the graph may not be to compare two countries, the chart does lend itself to doing so, and could result in false conclusions.

For example, try comparing Pakistan and Nigeria. In the smoothed chart, it looks like for a good 30 years, Nigeria's growth rate outstripped Pakistan's. The other chart suggests there was actually quite a bit more variability. You also might get the impression from the smoothed chart that over the past 15 years Nigeria's growth rate has slowed much more dramatically than Pakistan's. The second chart suggests that is probably not the case.

The little smoothing I've done in the past has made me a little nervous, so it's a topic I'd be interested in seeing discussed.

Chris Johnson

I also disagree that "nothing is lost by smoothing." By smoothing the data so heavily, you reduce the amount of information that can be read from the graph from 120 pieces of real data (the growth rate in 10 countries times 12 years) to 10 pieces of inferred data (whether each country is in the green, black or blue categories).

What does the y-axis actually represent in these plots? I assumed it was actual population growth per year, but the values should be around 1.5% in this case, not 0.15%. Perhaps it is growth per month?
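As a rough check on the per-month guess, here is a small Python sketch. It assumes simple monthly compounding and uses the round figures from this comment (1.5% and 0.15%), not values read off the chart, so it only shows whether the two numbers are in the same ballpark.

```python
# Sanity check under an assumed monthly-compounding model; inputs are the
# round numbers quoted in the comment, not values taken from the chart.

annual_rate = 0.015    # about 1.5% growth per year
charted_rate = 0.0015  # about 0.15%, as the y-axis appears to show

# Monthly rate that compounds to the annual rate.
monthly_equivalent = (1 + annual_rate) ** (1 / 12) - 1

# Annual rate implied if the charted value really were a monthly rate.
implied_annual = (1 + charted_rate) ** 12 - 1

print(f"monthly rate equivalent to 1.5% per year: {monthly_equivalent:.3%}")
print(f"annual rate implied by 0.15% per month:   {implied_annual:.3%}")
```

The two figures are of the same order, so a monthly rate is at least plausible, though this does not settle what the axis actually shows.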

Google will display this data (for every year, rather than every five) in a rather nice interactive format: http://www.google.com/publicdata?ds=wb-wdi&ctype=l&strail=false&nselm=h&met_y=sp_pop_grow&scale_y=lin&ind_y=false&rdim=country&idim=country:USA:IND:IDN:BRA:PAK:BGD:NGA:JPN:RUS:CHN&tdim=true&tstart=-315619200000&tunit=Y&tlen=49&hl=en&dl=en

which also indicates where the data is from (http://data.worldbank.org/indicator/SP.POP.GROW?cid=GPD_2 - with sparklines!)


Kaiser Fung. Business analytics and data visualization expert. Author and Speaker.