Finding a purpose
May 20, 2008
Sean C. asked a question that has been on my mind for some time: what do we think of treemaps? Are they too busy? Do they add to our understanding of the data?
Generally, this type of chart is pretty good for exploration but not so good for communication. We can stare at the chart for as long as we like and still not be sure what the designer wanted to convey.
Sean used it to show the components of Australia's CPI and to explain the source of recent inflation.
It does present the hierarchical structure of the data in a compact way. It also shows both the relative importance of each factor and its growth rate, with little fuss.
That said, the differing sizes and orientations of the boxes make it hard to compare their sizes. For example, is "home purchase" on the lower left larger than "financial services" in the middle? How about household services (blue on the top) relative to audio, visual and computing (blue on the lower right)?
Also note that the two data series do not carry equal weight: readers are likely to be drawn more to the box sizes than to the color gradations (which do not convey relative values well); as a result, the composition of the CPI, rather than the changes in its components, will gain more attention.
If the purpose of the chart is to communicate findings, then a data table enhanced with colors and boxes can do a good job. There are other ways to utilize the tabular format, such as sparklines or other symbols in lieu of numbers.
That said, the treemap is more intriguing and inviting than a table of numbers.
I think treemaps are often busy because the data set they're presenting is often busy. As an information visualisation technique, they suffer from being on the bleeding edge of what it's possible to cram into the human brain via the eye, so it's fairly natural that they should lack the kerblam factor of a simple bar chart. To compare apples and apples, the treemap's critics should always use the same data set as the treemap itself when declaring what's better, as you have with the table.
Instead of horizontal lines setting off the categories, have you considered titles, and indentation of the subcategories? Text indentation is fairly straightforward in Excel, although sadly not available in Conditional Formatting. The way I handle it is to create styles, include a column of "Indentation Level" numbers as well as a column of "Sort when finished" numbers, then sort the table to apply the styles en bloc, before sorting the table again into its final order.
When centering numbers, be aware that negative signs or decimal points can shift the center, and design the number format to cope. Decimal alignment of centered or left-justified numbers is possible in Excel, as discussed here. BTW, that article needs to have the "Table" tag, to be picked up in the tag cloud on the right.
Posted by: derek | May 21, 2008 at 01:57 AM
Thanks for the feedback on the chart. My initial inspiration was this chart in the New York Times. Although it looks beautiful, your comments about the difficulties inherent in a treemap certainly apply, and I suspect that judging the relative area of irregular shapes is even harder than comparing rectangles.
I should also note that I had fun using R to produce the chart. Once I've tidied up the code a little, I'm happy to share it with anyone who is interested.
Posted by: Sean Carmody | May 21, 2008 at 03:31 AM
I decided to do a table with a bit of infographics added in the form of bars. I'm not sure how best to include the percentage change, though.
By the way, did you notice your thumbnail was twice the file size of your full size image? :-)
Posted by: derek | May 21, 2008 at 03:33 PM
Perhaps a bar chart?
Option 1 - height is proportional to 'change' and width proportional to 'importance'.
Option 2 - height is proportional to 'importance', and colour is proportional to 'change'.
Either chart could be ordered by either 'change', or 'change' x 'importance' which would be total impact.
Posted by: Michael | May 21, 2008 at 08:13 PM
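Michael's Option 1 amounts to a variable-width bar chart, in which each bar's area is change times importance. A minimal sketch of the bar geometry in Python might look like the following. The component names and figures are invented for illustration and are not the actual CPI data, and `variable_width_bars` is a hypothetical helper, not anything used in the thread.

```python
# Invented figures for illustration only; NOT the real Australian CPI data.
# Each row: (component, importance as % of basket, % price change)
ROWS = [
    ("food", 15.0, 5.0),
    ("housing", 20.0, 3.0),
    ("clothing", 5.0, -1.0),
]


def variable_width_bars(rows):
    """Lay bars side by side: width = importance, height = change.

    Bars are ordered by change, largest first. Returns a list of
    (component, left_edge, width, height) tuples.
    """
    bars = []
    left = 0.0
    for name, importance, change in sorted(rows, key=lambda r: r[2], reverse=True):
        bars.append((name, left, importance, change))
        left += importance  # the next bar starts where this one ends
    return bars


if __name__ == "__main__":
    for name, left, width, height in variable_width_bars(ROWS):
        print(f"{name:10s} x={left:5.1f} width={width:5.1f} height={height:5.1f}")
```

Ordering by 'change' x 'importance' instead would only require changing the sort key to `r[1] * r[2]`.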
Neither display above actually shows the effect of each component on the overall CPI change. For example, how does the effect of "dark red" Dairy compare to "light red" House Prices? I would want to see a third column in the table showing column 1 x column 2.
Personally, I find the table above more difficult to read than the graphic! There must be a better way to present the data?
Posted by: Bob | May 21, 2008 at 11:59 PM
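The third column Bob asks for can be sketched in a few lines of Python. This assumes the weight is expressed as a percentage of the basket and the change as a percentage, so that, to a first approximation, weight x change / 100 gives each component's contribution in percentage points. The two rows below use made-up numbers, not the actual CPI figures, and the function names are hypothetical.

```python
# Invented figures for illustration only; NOT the real Australian CPI data.
# Each row: (component, weight as % of basket, % price change)
ROWS = [
    ("dairy", 1.0, 12.0),          # small weight, large change ("dark red")
    ("house purchase", 8.0, 3.0),  # large weight, small change ("light red")
]


def with_contribution(rows):
    """Append a contribution column in percentage points: weight x change / 100."""
    return [(name, w, c, w * c / 100.0) for name, w, c in rows]


def by_contribution(rows):
    """Return the rows with the extra column, largest contribution first."""
    return sorted(with_contribution(rows), key=lambda r: r[3], reverse=True)


if __name__ == "__main__":
    for name, weight, change, contrib in by_contribution(ROWS):
        print(f"{name:15s} {weight:5.1f} {change:6.1f} {contrib:6.2f}")
```

With these invented numbers the "light red" component ends up contributing more than the "dark red" one, which is exactly the comparison that colour and area alone cannot make.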
@derek I like your table. Perhaps you could also show the percentage change with bars that point right or left from a centre zero line for positive and negative respectively. Also, thanks for the note about the thumb size: I actually only have a single image (the full-size one) and use WordPress's resizing feature. Maybe it's not too good with PNGs! When I get a chance, I'll substitute in my own thumbnail image.
@Bob You make a good point about overall contribution. That's the problem with a colour-based quantification: while you can tell "hot" from "cold" there's no sense in which you can multiply colour and area!
Posted by: Sean Carmody | May 22, 2008 at 12:14 AM
By the way, in case anyone playing around with alternatives would like to have the actual data, I've posted it here.
Posted by: Sean Carmody | May 22, 2008 at 12:44 AM
Adding a category header for each section certainly helps, as does sorting the sections by total contribution.
Good point about multiplying the two columns to get the weighted contribution. Unclear how to illustrate it properly though...
Posted by: Kaiser | May 22, 2008 at 12:47 AM
Sorry, I meant Kaiser's thumbnail of the table is 34K, while the image it clicks through to is only 13K.
File size of images is a game I play with myself, to try to get the most information presented elegantly, in the smallest file. I recognise that in these days of broadband, a 100K file is almost as fast to download as a 10K one, so it's not practical to spend valuable time squeezing it down, but many people seem not to realise the positive effect that some simple quick actions can have. Abandoning JPGs for PNG or GIF is a particular hobby horse of mine: JPG is rarely the compactness winner for graphics.
I've recently worked out how to persuade Excel to produce crisp bitmap lettering instead of heavily anti-aliased blurry text, so I was pleased to be able to get this text-heavy table down to just under 9K :-)
Posted by: derek | May 22, 2008 at 03:05 AM
I propose this visualization:
http://services.alphaworks.ibm.com/manyeyes/view/SP9G~OsOtha6RjEEEvz1O2~
Here the boxes are sized according to their impact on CPI: weight multiplied by change. This also means that subcategories where prices have remained constant or decreased are not shown.
Still, it's easier to see where the price increases are coming from: the big boxes are private motoring, financial services, house purchase and renting.
I agree with Derek's 1st comment. Treemaps are an interesting technique but are not well known enough to be understood by a lay audience without explanation. Still, they are one of the few techniques that can be used to display information about a large number of items, something that the bar chart is not good at.
Posted by: vozome | May 22, 2008 at 05:41 AM
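The sizing rule vozome describes (box area proportional to weight multiplied by change, with flat or falling components dropped) can be sketched as below. This is only an illustration under assumptions: the figures are invented, `impact_box_shares` is a hypothetical helper, and it says nothing about how Many Eyes actually lays out its boxes.

```python
# Invented figures for illustration only; NOT the real Australian CPI data.
# Each row: (component, weight as % of basket, % price change)
ROWS = [
    ("private motoring", 10.0, 6.0),
    ("financial services", 8.0, 5.0),
    ("clothing", 4.0, -2.0),      # falling prices: no box
    ("communication", 3.0, 0.0),  # unchanged prices: no box
]


def impact_box_shares(rows):
    """Return each component's share of the total treemap area.

    Area is proportional to weight x change; components whose impact is
    zero or negative are left out, since a box cannot have negative area.
    """
    impacts = {name: weight * change for name, weight, change in rows}
    positive = {name: v for name, v in impacts.items() if v > 0}
    total = sum(positive.values())
    if total == 0:
        return {}  # nothing rose in price, so there is nothing to draw
    return {name: v / total for name, v in positive.items()}


if __name__ == "__main__":
    for name, share in impact_box_shares(ROWS).items():
        print(f"{name:20s} {share:6.1%}")
```

The trade-off is visible in the filter: the chart answers "where are the increases coming from?" but silently hides anything that fell or stayed flat.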
@vozome Wow, that's impressive! I like it.
Also, I certainly agree with you and Derek that the treemap is not for the uninitiated. I recently used the inflation chart in some client presentations. Before the meetings, our salespeople were very concerned that the chart was too busy and confusing. However, using it as a focus for a discussion about what's been going on with inflation down under, it actually worked very well, but only because I was there as a guide!
Posted by: Sean Carmody | May 22, 2008 at 08:56 AM
Out of curiosity, Sean, what kind of clients are these? (Me: I look after graphs for the OECD.)
Posted by: vozome | May 22, 2008 at 10:30 AM
I don't think the treemap is a difficult design, I think it's a design for an inherently difficult job.
Posted by: derek | May 22, 2008 at 10:48 AM
I picked up Sean's idea of a graphical table and enriched it with some sparklines to visualize the trend. I don't have the complete historical data, so I just used some random data for the sparklines; but I think the table shows that graphical tables can in certain cases be superior to treemaps.
Posted by: Andreas Lipphardt | May 24, 2008 at 06:35 AM
Sigh. These confusing typepad "Posted by:" blocks lose me so much credit in Junk Charts threads... :-) :-)
Posted by: derek | May 24, 2008 at 09:01 AM
Out of curiosity, what program was used to create this treemap? (Or was it "hand"-tailored?) It's far more attractive than most I've seen, with the category titles superimposed over the thick-bordered category rectangles rather than placed as "headers" of the rectangles.
Posted by: Joel | May 24, 2008 at 03:23 PM
@Joel I used some custom R code to create the main treemap. I'm happy to share the code, but I am in the process of tidying it up a little bit at the moment. I'll keep you posted...
@Vozome Clients are asset managers.
@derek I'm interested to know how you persuaded Excel to give you crisp bitmaps!
@Andreas Your graphical table looks good. What did you use to produce that?
Posted by: Sean Carmody | May 25, 2008 at 10:20 PM
Great! I'll keep watching for the R code! Thanks!
Posted by: Joel | May 25, 2008 at 10:54 PM
@Derek: Sorry for not giving you the credit, the "Posted by:" blocks are really confusing :)
@Sean: I used Excel to create the tables and MicroCharts to add the in-cell sparklines, in-cell bars and axis.
Posted by: Andreas Lipphardt | May 26, 2008 at 01:19 AM
@Andreas: Thanks for the pointer to MicroCharts...looks interesting.
Posted by: Sean Carmody | May 26, 2008 at 11:22 PM
Sean, the trick is to choose a font size less than 8 points, then size the table to match. Below that minimum, Excel abandons the attempt to render the font chosen, and defaults to a bitmap form.
If you do it right, you even get some subtle anti-aliasing gray pixels, as in my table above. But I can't work out now how I did it; all my attempts are producing strict black and white.
Posted by: derek | May 27, 2008 at 02:30 AM
I quite like this implementation of treemaps:
http://marumushi.com/apps/newsmap/newsmap.cfm
Seems to get to the point.
Posted by: tb | May 31, 2008 at 10:09 AM
@tb: love the news treemap
@Andreas: made use of your idea in my latest blogpost.
Posted by: Sean Carmody | Jun 04, 2008 at 09:01 AM
Andreas has now blogged about the graphical tables approach.
Posted by: Sean Carmody | Jun 12, 2008 at 04:46 PM
We further developed the idea of Graphical Tables - An Alternative to Treemaps on blog.xlcubed.com - More Information per Pixel! and came up with this solution.
Posted by: Andreas Lipphardt | Jun 13, 2008 at 01:48 AM