Comments

Cris

Chlorofluorocarbon is CFC. Those people are calling about recycling their old refrigerators.

I don't think the street light data are so obvious: why do people call about them so much more often between 10 am and noon than at other times? Also, graffiti and dirty conditions both have two clear peaks at 2pm and 7pm. That is weird. There are several more of these one-hour peaks. I'm concerned about the way the data were collected and plotted!

Martin

Kaiser,

I think you already phrased the most important issue: "no insights".

From a statistical point of view, we need to ask what model we expect to underlie the data. Are all the issues people call in about more or less equally distributed, with only the overall intensity changing over time? That is certainly too simple, as we already know that people are more likely to complain about noise at night.

That will lead us to a model that has certain *expected* intensities of complaints for certain times over the course of one day, estimated from a larger period of time.

To get insight into what is going on on a particular day, we would then need to plot the differences between the "model day" and the actual data.

This difference is something I keep on preaching to business people: "Don't be surprised by the data you look at, but be surprised by the deviation of that data from your expectation!" But for an expectation you need to have at least some kind of (naive) model ...
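A minimal sketch of that "model day" idea, assuming the complaints for one category are available as hourly counts per day. The function names and the numbers are made up for illustration; they are not taken from the chart's data.

```python
from statistics import mean

def model_day(history):
    """Expected count for each hour of the day, averaged over many days.

    `history` is a list of days, each a list of 24 hourly counts.
    """
    return [mean(day[hour] for day in history) for hour in range(24)]

def deviations(actual, expected):
    """Hour-by-hour difference between one day and the model day."""
    return [a - e for a, e in zip(actual, expected)]

# Hypothetical history: two days of hourly counts for a single complaint type.
history = [
    [10 + hour for hour in range(24)],
    [12 + hour for hour in range(24)],
]
expected = model_day(history)

# Hypothetical day that follows the usual pattern except for a bump at 2pm.
today = [11 + hour for hour in range(24)]
today[14] += 20

dev = deviations(today, expected)
```

Plotting `dev` rather than `today` puts the 2pm bump on a flat baseline, so the eye goes to the surprise and not to the usual daily rhythm.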

bv

'The fact that complaints about "street lights" happen during the day is obvious.'

Not for me. I would have bet a lot that the opposite would have been true.

Sewer maintenance at 3:00 in the morning? Why would that be?

The graffiti complaints at 7:00 pm probably make sense, because that is when people travelling home by train would spot graffiti and call to complain. But 2:00 pm?

It may not be the best way to display this, but by getting this volume of information across in this format, it makes it easy to look for "deviation from ... [our] expectation" (to borrow Martin's phrase).

Zubin

While I agree this data is fit for additional analysis, showing the raw data in this way is informative too. For example, who really knows the relative number of complaint x versus complaint y? Or the magnitude of the difference in night volume versus day volume? (Though as I type that last sentence I realize there is no indication of what the horizontal bars indicate!)

Zubin

Sorry, I should have said that there are NO horizontal bars.

Kaiser

Zubin: this chart would be useful for exploration. It is very dangerous to home in on little bumps and troughs because most of those would be random noise. What the chart designer could have done is to investigate the interesting bits, determine if they are "statistically significant" and then present a chart that draws attention to the real information, and hides the random noise.
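One rough way to separate bumps from noise, sketched under an assumption that is not in the original post: hourly counts behave roughly like Poisson counts, so the standard deviation of a count is about the square root of its expected value. The function name, the threshold of 3, and the data are all hypothetical.

```python
import math

def flag_hours(actual, expected, threshold=3.0):
    """Return (hour, z) pairs where the count is far from its expected value.

    Assumes Poisson-like noise, so the standard deviation of a count is
    approximately sqrt(expected). This is a screening heuristic, not a
    formal significance test.
    """
    flagged = []
    for hour, (a, e) in enumerate(zip(actual, expected)):
        if e <= 0:
            continue  # no baseline to compare against
        z = (a - e) / math.sqrt(e)
        if abs(z) > threshold:
            flagged.append((hour, round(z, 2)))
    return flagged

# Hypothetical numbers: a flat expectation of 25 calls per hour.
expected = [25.0] * 24
actual = [25] * 24
actual[14] = 50  # a large bump at 2pm
actual[3] = 28   # a small wiggle at 3am

flags = flag_hours(actual, expected)
```

Here only the 2pm bump is flagged; the 3am wiggle is well within the noise. With 24 hours and many complaint categories, some hours will cross any fixed threshold by chance alone, so a stricter cutoff or a multiple-comparison correction would be sensible before drawing attention to a bump.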
