


Chlorofluorocarbon is CFC. Those people are calling about recycling their old refrigerators.

I don't think the street light data are so obvious: why do people call about them so much more often between 10am and noon than at other times? Also, graffiti and dirty conditions both have two clear peaks at 2pm and 7pm. That is weird. There are several more of these one-hour peaks. I'm concerned about the way the data were collected and plotted!



I think you already phrased the most important issue: "no insights".

From a statistical point of view we need to ask what model we expect behind the data. Are all the issues people call in about more or less equally distributed, with only the overall intensity changing over time? This is certainly too simple, as we already know that people are more likely to complain about noise at night.

That will lead us to a model that has certain *expected* intensities of complaints for certain times over the course of one day, estimated from a larger period of time.

To get insight into what is going on on a particular day, we would then need to plot the differences between the "model day" and the actual data.

This difference is something I keep on preaching to business people: "Don't be surprised by the data you look at, but be surprised by the deviation of that data from your expectation!" But for an expectation you need to have at least some kind of (naive) model ...
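The "model day" idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the call counts are made up, the expected intensity is taken to be a plain hourly average over past days, and the names `model_day` and `deviations` are hypothetical.

```python
from statistics import mean

HOURS = 24

def model_day(history):
    """Average the counts hour by hour over many days to get the expected day."""
    return [mean(day[h] for day in history) for h in range(HOURS)]

def deviations(actual, expected):
    """Actual minus expected, per hour; positive means more calls than usual."""
    return [a - e for a, e in zip(actual, expected)]

# Made-up data (an assumption, not the real complaint log):
# three ordinary days of 10 calls every hour, then a day with a spike at 2pm.
history = [[10] * HOURS for _ in range(3)]
today = [10] * HOURS
today[14] = 25

expected = model_day(history)
diff = deviations(today, expected)

# The hour with the largest positive deviation is where to look first.
busiest = max(range(HOURS), key=lambda h: diff[h])
print(busiest, diff[busiest])
```

Plotting `diff` instead of `today` puts the surprise itself on the chart, rather than leaving the reader to subtract the usual daily pattern by eye.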


'The fact that complaints about "street lights" happen during the day is obvious.'

Not for me. I would have bet a lot that the opposite would have been true.

Sewer maintenance at 3:00 in the morning? Why would that be?

The graffiti complaints at 7:00 pm probably make sense, because that is when people travelling home by train would spot the graffiti and call to complain. But 2:00 pm?

It may not be the best way to display this, but getting this volume of information across in this format makes it easy to look for "deviation from ... [our] expectation" (to borrow Martin's phrase).


While I agree this data is fit for additional analysis, showing the raw data in this way is informative too. For example, who really knows the relative number of complaint x versus complaint y? Or the magnitude of the difference in night volume versus day volume? (Though as I type that last sentence I realize there is no indication of what the horizontal bars indicate!)


Sorry, I should have said that there are NO horizontal bars.


Zubin: this chart would be useful for exploration. It is very dangerous to home in on little bumps and troughs because most of those would be random noise. What the chart designer could have done is to investigate the interesting bits, determine if they are "statistically significant" and then present a chart that draws attention to the real information and hides the random noise.
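One rough way to separate real bumps from noise is sketched below. It rests on assumptions that are not in the original discussion: hourly call counts are treated as Poisson with a known expected value, a Bonferroni correction accounts for checking 24 hours at once, and the data and function names are invented for the example. It is meant for small counts only, since the factorial terms overflow for very large ones.

```python
import math

def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected)."""
    below = sum(
        math.exp(-expected) * expected**k / math.factorial(k)
        for k in range(observed)
    )
    return max(0.0, 1.0 - below)

def flag_bumps(actual, expected, alpha=0.05):
    """Return the hours whose counts are surprisingly high.

    The threshold is alpha divided by the number of hours tested
    (Bonferroni), so that scanning many hours does not flag pure noise.
    """
    threshold = alpha / len(actual)
    return [
        h
        for h, (a, e) in enumerate(zip(actual, expected))
        if e > 0 and poisson_upper_tail(a, e) < threshold
    ]

# Made-up data: 10 calls expected every hour, a small wobble at 3am
# and a large spike at 2pm.
expected = [10.0] * 24
today = [10] * 24
today[3] = 12
today[14] = 25

print(flag_bumps(today, expected))
```

Under these assumptions a count of 12 against an expected 10 is well within ordinary variation, while 25 against 10 is not, so only the second would deserve emphasis on a chart.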


Kaiser Fung. Business analytics and data visualization expert. Author and Speaker.