## Graphical equity 1

##### Oct 02, 2006

I've been slow checking my email lately: several of you have pointed me to interesting charts, and I will work through them over the next week or so.  This post is inspired by John S., who forwarded two charts illustrating where the U.S. gets its energy and how the U.S. uses its energy.

The first visualization, created by the Energy Information Administration, emphasizes the physical connections between energy sources and energy use sectors.  This construct is known as a "network graph" and is widely used by engineers; the ovals/rectangles are called "nodes", the lines "arcs".  It functions well as a map of physical relationships, but it fails as a vessel for data.  The problems are several:

• The web of arcs is messy and gets worse with more nodes
• Here, each node has either an input or an output but not both, which keeps things simple.  If a node is allowed to take both input and output (the so-called transshipment node), the graph gets messier still
• Arcs converging at a node leave little space for data labels

Next, the Skeptical Optimist blog recast the data onto a construct known as a "Marimekko" to management consultants.  Deconstructed, these are stacked column charts in which the width of each column represents the relative size of each energy source.

This one does a fairly effective job of showing that most of our transportation needs are met with oil, our electricity needs are met with coal, our energy sources are roughly split between oil, gas and coal, and so on.
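
For readers who want to see the construction spelled out, here is a minimal Python sketch of the geometry behind a Marimekko: each column's width is that source's share of the grand total, and each segment's height is the sector's share within that source. The numbers in `usage` are made up for illustration (they are not the EIA figures), and the function name `marimekko_rects` is my own, not part of any charting library.

```python
# Hypothetical energy figures in arbitrary units -- NOT the EIA data.
usage = {
    "oil":  {"transportation": 28, "industrial": 8,  "electricity": 4},
    "gas":  {"transportation": 0,  "industrial": 15, "electricity": 15},
    "coal": {"transportation": 0,  "industrial": 3,  "electricity": 27},
}


def marimekko_rects(table):
    """Return one rectangle per (source, sector) cell, laid out in a unit square.

    Column width  = source total / grand total.
    Segment height = cell value / source total.
    """
    grand_total = sum(sum(row.values()) for row in table.values())
    rects = []
    x = 0.0
    for source, row in table.items():
        col_total = sum(row.values())
        if col_total == 0:
            continue  # an empty source would be a zero-width column; skip it
        width = col_total / grand_total
        y = 0.0
        for sector, value in row.items():
            height = value / col_total
            rects.append({
                "source": source, "sector": sector,
                "x": x, "y": y, "width": width, "height": height,
            })
            y += height  # stack the next segment on top
        x += width       # start the next column to the right
    return rects


for r in marimekko_rects(usage):
    print(r["source"], r["sector"], round(r["width"], 2), round(r["height"], 2))
```

The rectangles could then be handed to any drawing tool; the point is only that the area of each rectangle (width times height) is that cell's share of all energy.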

One weakness of Marimekko is "inequity": by its origin as a column chart, it elevates one variable over the other.  What's the relative size of energy used by the industrial sector (blue)?  That's not a question easily answered by this chart.  Even when the column segments are adjoining, as in the case of electricity use (yellow), it is very taxing to size up the yellow area relative to the total area.
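
To make the inequity concrete, here is a rough Python sketch of the arithmetic a reader has to do in their head to size up one sector. The widths and heights in `columns` are hypothetical readings off an imagined chart, not the EIA data, and `sector_share` is a name I made up for this illustration.

```python
# Hypothetical readings off a Marimekko (NOT the EIA data):
# "width" is the source's share of all energy, "heights" are the
# sector shares within that source.
columns = {
    "oil":  {"width": 0.40,
             "heights": {"transportation": 0.70, "industrial": 0.20, "electricity": 0.10}},
    "gas":  {"width": 0.30,
             "heights": {"transportation": 0.00, "industrial": 0.50, "electricity": 0.50}},
    "coal": {"width": 0.30,
             "heights": {"transportation": 0.00, "industrial": 0.10, "electricity": 0.90}},
}


def source_share(columns, source):
    """A source's share of the total: read directly off one column width."""
    return columns[source]["width"]


def sector_share(columns, sector):
    """A sector's share of the total: sum of width * height over every column."""
    return sum(col["width"] * col["heights"].get(sector, 0.0)
               for col in columns.values())


print("oil share of all energy:", source_share(columns, "oil"))
print("industrial share of all energy:", round(sector_share(columns, "industrial"), 2))
```

The source share is a single lookup, while the sector share needs a multiplication for every column and then a sum. That asymmetry is what I mean by one variable being elevated over the other.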

So it is that we seek a graph that treats the two variables (source, use sector) equitably.  More later.

Update: Jon posted a response here, and points to a tutorial for creating Marimekko-type charts.


Are you classifying the column chart above as "junk"? If so, what are you proposing as a superior alternative?

You can overcome the stated shortcoming of Marimekko charts by adding a "Total" stack to the right of the main chart, as shown on this page:

http://peltiertech.com/Excel/Commentary/GraphicalEquity1.html

The "statistical" name for a Marimekko plot is a mosaic plot. You may be interested in a close relative, the fluctuation diagram, which doesn't have the implicit conditioning of the mosaic plot.

Mondrian has nice interactive implementations of both of these.

The Energy Information Administration often uses energy flow diagrams, which I find a much more compelling way to convey this type of information. See http://www.eia.doe.gov/emeu/aer/pdf/pages/sec8_3.pdf

Steve: I like the Marimekko better than the network graph, but at the same time, the Marimekko has its problems. I'm hoping readers like Jon will come up with ideas. Stay tuned.

Hadley: I've seen mosaic plots for visualizing categorical variables, and the Marimekko is indeed an extension of those to continuous variables.

Zuil: Unfortunately with the energy flow diagram you linked to, it doesn't reveal what percentage of all nuclear energy is used by residential, commercial, industrial etc. Same for petroleum, coal, renewable etc. Still, it conveys a lot of information in a very effective way.

Lope: I linked to the “Flow” or “Sankey” diagram as an example of the approach, not for its data. I do agree that it would be more enlightening if the data were represented as percentages, though that could be done very easily.

I wonder why the EIA used the chart that Keiser linked to, rather than using the Sankey diagram it typically does… Perhaps it was due to a misguided belief that Sankey diagrams are too “technical” for a general audience… If so, I very much disagree. Though often difficult to draw, Sankey diagrams are IMHO unbeatable for representing any type of lossless flow (energy, money, fluids, etc.).

"I wonder why the EIA used the chart that Keiser linked to, rather than using the Sankey diagram it typically does?"

Size constraints maybe? If you go to this link, http://www.eia.doe.gov/overview_hd.html, they did use it, but it's very difficult to make out the text. Maybe Sankey diagrams need a lot of space to be effective.

I agree with you, I prefer the Sankey diagram. It's more fun, draws me in.

Oh, I wasn't paying attention to the chart - it's not a mosaic plot, but a stacked spinogram. But please stop calling it Marimekko - this is Marimekko.
