Hoisted from the archives: a revolution
Apr 06, 2010
In October 2007, I wrote about the "canvas" metaphor for graphing software. This was what I said:
With the advent of AJAX and other interactive technologies, one can only hope that new graphing software will use the "canvas" metaphor. If we want to reduce the spacing between bars, we should be able to grab the bars and move them together. If we want to change the ordering, we should be able to mouse over some menu and select a pre-defined ordering scheme, or to drag and move bars around as we please. etc. etc.
To push this metaphor further, this kind of software should facilitate the "exploratory" stage of graph-making. I blogged about this stage of making sketches before. One longs for software that allows one to flip through many different chart types quickly, to settle on the desired type, and then to make the nitty-gritty changes to the axes, colors, dots, etc.
The revolution has arrived in the form of JMP's Graph Builder function. It is not perfect yet, as even the example I use will show, but I'm excited because we are getting closer to that "canvas" metaphor.
I'm going to re-make this inedible pair of donuts from an otherwise quite nice infographic on the growth and nature of spam in the last 10 years. (New Scientist)
I have often pointed out the biggest shortcoming of donut charts: the most important clue to the size of each sector of the underlying pie chart, that is, the angle at the center of the pie, has been cut out of the chart, and often, as here, obscured by a number.
There are dramatic shifts in proportions of spam types during the last decade but the effect is underwhelming as depicted.
In the Graph Builder, I can push around the data and create different chart types. First, I made a small-multiples bar chart.
By clicking on the word "Year" and dragging it to a box called "Overlay", I made a paired bar chart:
What about a dot plot instead? This change requires a right click but is easy enough:
Here's where I encountered a little inconvenience. It's probably ignorance on my part since I didn't read the manual. I couldn't figure out how to increase the dot size for all dots at once, only one at a time.
In any case, I'm still searching. I want to do a small-multiples line chart. For this, I drag the word "Year" into the bottom of the chart labelled "X", and then right-click to add a line to the dot chart.
This is close to a desired chart type for this data. The change from year to year is highly apparent, and the increased and decreased spam types are also obvious. I would color the increases differently from the decreases if I had the time.
I had a very difficult time getting the year labels to say 1999 and 2009, which are the logical points for this data, and in the end I failed. JMP seems to have a mind of its own.
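The coloring idea can be sketched outside JMP in a few lines. This is only an illustration: the category names and proportions below are placeholders, not the New Scientist figures, and the color names are arbitrary choices.

```python
# Placeholder proportions (percent of all spam) per category and year.
# These values are hypothetical and are NOT the data behind the charts.
proportions = {
    "type_a": {1999: 30.0, 2009: 10.0},
    "type_b": {1999: 5.0, 2009: 25.0},
    "type_c": {1999: 12.0, 2009: 12.0},
}


def direction_color(start, end, up="red", down="blue", flat="gray"):
    """Pick a line color from the sign of the change between two years."""
    if end > start:
        return up
    if end < start:
        return down
    return flat


# One color per category, decided by whether its share rose or fell.
colors = {
    category: direction_color(values[1999], values[2009])
    for category, values in proportions.items()
}
```

Each line in the small-multiples chart would then be drawn with the color assigned to its category.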
Since it takes no time, I experimented some more. By moving "Category" to "Wrap", I reproduced the above chart but in a matrix form:
Finally, I made the "Category" an "overlay" which resulted in this chart. This is kind of like the Bumps chart but obviously a bad idea for this data: (I'm not even showing the really ugly legend).
So, my dream toy -- the "canvas" style graph maker -- is here! It only takes a few minutes to move the data around this canvas, and see these different chart types.
I indicated that this goes a long way but isn't perfect. Right now, sketching and exploring are easy but refining and detailing are not as easy.
What I would like to see: once the general form of the chart is chosen, maybe a second canvas is needed, with Photoshop as a metaphor, in which we can chisel out the nitty-gritty details, like the axis labels, dot sizes, line widths and so on.
Also, the number of chart types can, and I presume will, be increased over time. For instance, I don't think the current version allows a profile chart; it seems to adhere to the overly-rigid rule that a categorical data series should not be connected by a line.
(I should say that in the current release, one way to accomplish this is to save the resulting graph-sketch as a "JMP script" and then go into the code and change things around. But since we are doing point and click, and visual interaction, why not go all the way?)
Most existing graphing software falls at one of two extremes: the Excel style, which is super-rigid, or the R style, which allows minute control over every little thing. This, I think, is the third way.
To make a profile chart in JMP, you will need a data table with three columns: first, a continuous numeric column containing the data; second, a categorical column for the primary categorization; and third, a categorical column to identify the observations for each series. In this case, the data column would be "Proportion", and the categorical columns would be "Year" and "Category". The example data table should have 20 rows. From here, make a Oneway plot using Analyze>>Fit Y by X. Put the data column, "Proportion", in the "Y, Response" box. Put the "Year" column in the "X, Factor" box and hit "OK". After the plot is generated, click the little red triangle next to the graph title to bring up the platform options menu. Click on "Matching column...", which will bring up a small column selection dialog. Here you can select the "Category" column, and the graph will now display colored lines connecting the two observations for each category. This procedure can be extended to other examples with more X categories or continuous X variables, and grouping with "Matching column..." can allow you to perform analyses like fitting regressions to each individual series rather than the entire data set.
Posted by: Steve Kern | Apr 06, 2010 at 10:03 AM
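The three-column layout described in the comment above, and the way a matching column ties rows into lines, can be sketched in plain Python. This is not JMP code; the category names and values are placeholders, and `profile_series` is a hypothetical helper written for this illustration.

```python
from collections import defaultdict

# Long-format rows: (Category, Year, Proportion).
# The values are placeholders, not the actual spam data.
rows = [
    ("type_a", 1999, 30.0),
    ("type_a", 2009, 10.0),
    ("type_b", 1999, 5.0),
    ("type_b", 2009, 25.0),
]


def profile_series(rows):
    """Group rows by the matching column (Category) so that each group
    becomes one line of (year, proportion) points, ordered by year."""
    series = defaultdict(list)
    for category, year, proportion in rows:
        series[category].append((year, proportion))
    return {category: sorted(points) for category, points in series.items()}


lines = profile_series(rows)
```

Each entry in `lines` corresponds to one connected line in the profile chart, with one point per year.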
Hold Ctrl while you right click to apply marker size changes to everything in the window. And you can define every aspect of the axis marking system by right clicking on the axis and choosing "Axis settings." You should be able to set the min, max, increment, etc. You can also just click and drag!
Posted by: Daniel | Apr 06, 2010 at 11:06 AM
From the creator of Graph Builder, many thanks for the kind words and suggestions for refinement. Graph Builder, and JMP in general, is focused on fast interactive exploration, but we're always trying to make improvements in visual quality as well. For the legends, it's possible to clean them up a bit by right-clicking on the legend title and choosing Legend Settings.
Graph Builder does allow line elements even with categorical X values. Despite the rule you refer to, I've come to see that the ability of the line slopes to show variation patterns often outweighs any misleading suggestion of continuity between the categories.
With such lines you can make Profile Charts if the data is in the right form: a single response variable as Y with one categorical variable as X and another categorical variable as Overlay. If you have multiple variables, you can use Tables:Stack to put them in the form Graph Builder wants. Or just use the Parallel Coordinates plot in JMP, especially if the responses have different scales.
The continuous axis is not flexible enough to show only "odd" values like 1999 and 2009, but you can make year categorical ("nominal" in JMP) and get a two level categorical axis with those years.
Posted by: Xan Gregg | Apr 06, 2010 at 09:32 PM
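For readers unfamiliar with stacking, the reshaping mentioned in the comment above (one column per year turned into one row per category and year) looks roughly like this in plain Python. It is a sketch of the general idea, not of JMP's Tables:Stack command itself; the `stack` helper, the column names, and the values are assumptions made for illustration.

```python
# Wide layout: one row per category, one column per year.
# The values are placeholders, not the actual spam data.
wide = [
    {"Category": "type_a", "1999": 30.0, "2009": 10.0},
    {"Category": "type_b", "1999": 5.0, "2009": 25.0},
]


def stack(wide_rows, id_column, value_columns, label="Year", data="Proportion"):
    """Turn each (row, value column) pair into its own long-format row."""
    long_rows = []
    for row in wide_rows:
        for column in value_columns:
            long_rows.append(
                {id_column: row[id_column], label: column, data: row[column]}
            )
    return long_rows


long_rows = stack(wide, "Category", ["1999", "2009"])
```

The year stays a text label here, which is loosely analogous to treating it as a categorical ("nominal") variable, so an axis built from it would show only those two levels.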
The biggest shortcoming of donut charts is in fact that they make me salivate.
Posted by: Pavlov's data analyst | Apr 07, 2010 at 04:03 AM