


And they've got shadows with everything, and the ubiquitous massively-labelled scales.

Actually the year scale is the most hilarious. It consists of the same year repeated three times over before moving to the next.

Yihui Xie

Only a trivial suggestion: perhaps it's better to set the x-axis range of the scatter plot to [0%, 2.5%] as well (the same as the y-axis), so that people can easily see where the real "diagonal line" is.


I agree that the relational plot is far more explanatory than the time series, but presuming the dividing point chosen was the maximum for each individual city, don't we have to take the apparent message of the plot with a grain of salt, since the deflationary time series for some of these cities is a lot shorter (so far) than that for other cities? I'd also argue that the steepness of the decline relative to the incline is impossible to gauge without a better definition of the start points (in the case of the inclining side) and end points (in the case of the declining side).


Martin: good points. With time series data, especially indices, knowing which time point was chosen as the reference level is very important; here, I didn't change the data, it's Jan 2000 = 100.

Note that the rates plotted are for compound monthly growth rates and so the length of the decline does not matter. One can grouse that we should model the curve (exponential decay, etc.); I'll just leave it to others to explore this avenue.

The precise phase definitions were Jan 2000 to peak month for inflationary phase, and from peak month to May 2008 (current) for deflationary.
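A minimal sketch of how a compound monthly growth rate could be computed for each phase, assuming the usual formula (end / start)^(1 / months) - 1. The function name and the index levels below are hypothetical and are not taken from the Case-Shiller data. Because the rate is expressed per month, phases of different lengths remain comparable.

```python
def compound_monthly_rate(start_level: float, end_level: float, months: int) -> float:
    """Constant monthly rate that carries start_level to end_level over `months` months."""
    if months <= 0:
        raise ValueError("months must be positive")
    if start_level <= 0 or end_level <= 0:
        raise ValueError("index levels must be positive")
    return (end_level / start_level) ** (1.0 / months) - 1.0


# Hypothetical city: index starts at 100 (Jan 2000 = 100), peaks at 200
# after 72 months, then falls to 160 over the following 24 months.
inflation_rate = compound_monthly_rate(100.0, 200.0, 72)
deflation_rate = compound_monthly_rate(200.0, 160.0, 24)

print(f"inflationary phase: {inflation_rate:.2%} per month")
print(f"deflationary phase: {deflation_rate:.2%} per month")
```

Plotting the magnitude of the deflationary rate against the inflationary rate gives one point per city, which is the kind of comparison the scatter plot makes.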


Yihui: I usually prefer to square out the plot as suggested. I tried it here as well but didn't like the result; by doing this, the entire right half of the plot would be empty.


You may put your annotation on the right inside the plot so the space will not be wasted :-) (and draw a diagonal line from top-left to bottom-right)


I get your drift, but I read Iacono (the source of the chart) regularly, and he does the 20 color bit because the folk following the Case-Shiller index love, just love, to be able to contrast and compare the 20 cities all at the same time - imagine what happens when C-S widens it out to 30 or 40 cities....

P.S. - I'm sending him a link to this post - he'll get a kick!


I too would love to be able to compare and contrast all 20 cities at the same time. The trouble is that the Excel default zillion-color scheme doesn't confer that ability. And it's not just an Excel default problem either; color choices that would genuinely allow readers to do that are almost impossible for even the most skilled graph designer to arrange.

(I don't say completely impossible; I can see how a hierarchical hue-luminance set might make some sense of the pack. But the work hasn't been put in here)

Google for "William Cleveland", who did the early work on people's actual graphical abilities that showed that they weren't really able to see information beyond a certain complexity, if the means used to present that complexity were too far down a hierarchy that has color at the bottom, and area and angle near the bottom. (this is why pie charts are so bad for understanding, no matter how popular they are for their prettiness)

I wonder if there's a graphical equivalent of the Dunning-Kruger effect, where people who can't read a pretty colored graph don't know they can't, so they think it's a great graph? Not quite the same: I'm thinking of it as a property of the graph here, not of the people.

Jon Peltier has used complex collections of data to make an interactive chart that compares any two cases, against a background of all cases. This would work quite well for the Case-Shiller set.

Hisham Abdel Maguid

Epic Systems, together with Beemode (www.beemode.com), has developed data visualization software called "Trend Compass", which is almost ready for release. It is an extension of Gapminder, which was invented by a Swedish professor. You can view it:

- www.gapminder.org

It is a new concept for viewing statistics and trends in an animated way. It could be used in presentations, analysis, research, decision making, etc.

Here are some links :
- Part of what we did with some Governmental institution:

- A project we did with Princeton University on US unemployment :

- April 2008 Media Monitoring on Cars TV ads (ad duration vs occurrences over time) :

- Ads Monitoring on TV Satellite Channels during April 2008. Pick Duration (Ads daily duration) vs Repeat (Ads repetition per day).

I hope you could evaluate it and give me your comments. So many ideas are there.

You can test the software by uploading data on our website and getting the corresponding Flash charts. This is for a limited number of users.


Eng. Hisham Abdel Maguid


Kaiser, which software do you use to prepare your charts?

If you have not done it before, a post about tools and software would be extremely valuable...


Stan Tyan

Data science is useless if you can’t communicate your findings to others, and visualizations are imperative if you’re speaking to a non-technical audience. If you come into a board room without presenting any visuals, you’re going to run out of work pretty soon.

More than that, visualizations are very helpful for data scientists themselves. Visual representations are much more intuitive to grasp than numerical abstractions. That’s just human nature, whether you’re a data scientist or not.

Kaiser Fung. Business analytics and data visualization expert. Author and Speaker.