The cross-hairs of religions

Long-time reader Nick B. found this attractive flow chart.


The chart was produced by the Internet Monk blog. The data was culled from this report (PDF) by the Pew Forum.

The cross-hairs trumpet excitement but the reader is left without much. One can tell that the unaffiliated proportion (red) has more than doubled, mostly at the expense of Catholics (green); that most religions retain the vast majority of their faithful (at least by internal proportions); and that some people of each faith move to another faith.

Yet none of these high-level insights requires a chart that contains data on movement between each pair of religions.

One smart thing about this chart is the inclusion of "unaffiliated / no religion", which completes the picture; otherwise, some previously faithful people would drop off the chart (literally).

The other smart thing is its self-sufficiency: none of the data are printed on the chart, and I doubt readers miss them.


Here, I attempt an alternative, which is a variant of the Web of Debt chart discussed here.


Note the economy of colors, lines, etc. I have chosen to use the number of people with a particular childhood faith as the base for all the percentages; other bases can be selected. For example, the unaffiliated group has grown by 144% of the childhood base, with about half of that growth coming from former Protestants; meanwhile, an exodus of Catholics has occurred. (PS. The data for other faiths being incomplete in the aforementioned report, I made up some of the data so as to finish the chart.)

If the line thickness is made proportional to the percentages, that would eliminate the need to have all those numbers on the chart.

Untangling Europe's debt web

A number of blogs have hailed this NYT diagram/chart/infographic as "nice". The accompanying article is here.


Whether this is nice or not depends on what message you want to convey with this graphic. If it is entanglement, then yes, the graphic reveals the complexity very well. If one wants to understand the debt situation in Europe, then no, this chart doesn't make it clear at all.

From the perspective of someone wanting to dissect the debt web, an enhanced data table is hard to beat.


The first section looks at the interdependency between the five troubled countries, collectively known as PIIGS on Wall Street. The additional debt owed to Britain, Germany and France is shown below. Notice that the original chart does not treat these three countries the same way as PIIGS: we do not know the values of the arrows pointing from these three into PIIGS.


Expressed in per-capita terms, Ireland stands out as the worst of the bunch while the citizens of the other four countries are bearing roughly equal amounts of debt per person.


I tried to come up with something more "fun", as below:

Here, I opted to use a small multiples chart to split the countries. In so doing, I accepted redundancy in search of clarity. Each amount is plotted twice: once as a borrowing (red line) and once as a lending (black arrow).

It is immediately clear why Greece is the most urgent issue.

Perhaps the chart type is not as important as the transformations I did to the data:

1) All amounts shown are "net" amounts between any pair of countries. In the original data, there are two arrows between each pair. For example, Italy owes Ireland $46 million but Ireland owes Italy $18 million; this means Italy owes Ireland $28 million net.

2) All amounts are expressed per capita. Since the populations of these countries vary from 4.5 million (Ireland) to 60 million (Italy), the total debts cannot and should not be compared with one another.

3) Not shown here, but I also expressed the net amount lent/borrowed per dollar of GDP. This is another metric that makes sense. The nominal GDP of these countries ranges from $0.2 trillion to $2 trillion. The PPP GDP has a similar range.

4) One item I did not fix is the currency. Given the fluctuation in exchange rate between the Euro and US$, I think it may be better to express all the numbers in Euros.
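The first two transformations above can be sketched in a few lines of Python. This is only an illustration, not the code used to build the chart: it uses the Italy/Ireland amounts and the two populations quoted in the list, and the function names `net_owed` and `per_capita` are hypothetical.

```python
def net_owed(a_owes_b, b_owes_a):
    """Net amount A owes B; a negative result means B owes A on net."""
    return a_owes_b - b_owes_a


def per_capita(amount, population):
    """Spread an amount over a country's population."""
    return amount / population


# Amounts as quoted in the text (in dollars).
italy_owes_ireland = 46e6
ireland_owes_italy = 18e6

# Populations as quoted in the text.
populations = {"Ireland": 4.5e6, "Italy": 60e6}

# Step 1: collapse the two arrows between a pair into one net amount.
net = net_owed(italy_owes_ireland, ireland_owes_italy)

# Step 2: express the same net amount per person on each side.
owed_per_italian = per_capita(net, populations["Italy"])
lent_per_irish = per_capita(net, populations["Ireland"])

print(net, owed_per_italian, lent_per_irish)
```

The same net amount looks very different per person on the two sides because Italy's population is more than ten times Ireland's, which is the reason for plotting each amount from both perspectives.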

A next step would be to include Britain, Germany and France.

Reference: "In and Out of Each Other's European Wallets", New York Times, April 30, 2010.

PS. Reader JF pointed out an inconsistency in the numbers on the chart. I revised the chart to fix this issue. In the current chart, one can read the information as: the average Portuguese owes Spain $5,453, owes Italy $141, while having lent $903 to Greece and $1,561 to Ireland. Each chart can be interpreted from the perspective of the average citizen in that particular country. (For details, see the comments below.)

Leave good alone

In Cousin misfit, we looked at a problematic area chart in which the areas on the chart contain no useful information. The lines in a line chart should carry some meaning, and so too should areas in an area chart.


The Wall Street Journal recently printed something that looked like a cross between a column chart, an area chart, and a flow chart.  Whatever it is, the areas of the pieces do not match the data.

The data describes how the TV market is split between the top 5 brands (comprising over 50% of the total unit sales) and all other brands -- basically the six numbers printed on the chart.

The graphical construct can be broken up into three parts: a stacked column (on the left), a stacked column with gaps (on the right), and some connecting areas (which are parallelograms).

The last two parts are unnecessary, and in particular, the parallelograms distort the total areas.

It can be baffling to the reader why the left column is shorter than the right column when both show the identical data.

At first, I thought this was some kind of flow chart illustrating the change in market share over time, but that's not the case.

What's wrong with the standard stacked column?

Reference: "Samsung Edges out TV Rivals", Wall Street Journal, Feb 17 2010.


Here are some of my favorite links from other places:

A spatial journey illustrating a very long scale, created by the Genetic Science Learning Center (here)

Long scales are very difficult to deal with in charts; I have never been satisfied with log scales since they address the designer's challenge of trying to fit everything onto one page, but do not deal with the reader's need to compare the elements accurately.

Not sure how this helps but perhaps some of you will figure it out

Tommi left a comment about this conceptual chart by xkcd, which has been making the rounds.  Fits into our Light Entertainment category.

Says there is no optimal chart type.  A type that works very well for one data set may get hopelessly cluttered for another, similar data set.

From fellow bloggers (especially Jorge), a whole series of views of the U.S. unemployment figures by state over time.  Alternatives that are much more interesting to look at than the typical line chart. Jorge even found something in Excel that looks good.

Vanishing act

This is a well-executed chart showing the complex dealings between Wall Street firms in the last 40 years.


They found a way to present all the information without criss-crossing lines.  The right column is the clincher.  It lists all the important recent events.

Reference: "Wall Street: RIP", New York Times, Sep 28 2008.

Flows and partitions

Andrew M., a new but loyal reader, didn't like the flow charts used by the EPA to illustrate cleantech.  We had some lively discussion on flow charts before.  The bottom line seems to be that they are difficult beasts to tame, especially when the relationships are complex.  The example shown by Andrew (below) is not particularly horrid in the scheme of things.  It's the abundance of annotations and colors that causes dizziness.


Here's a view of the same data, using a partitioning approach.  The inputs are fixed at 100 units, which I find easier to comprehend, while the original fixed the output at 30 units of electricity and 45 units of heat.  And of course, it is a tremendous service to readers not to have to work out the efficiencies.  Tacitness is a vice, not a virtue, in graph-making.
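The rescaling behind the partition view can be sketched in Python. Treat this as a rough illustration: the fuel input of 150 units is a made-up figure (the post does not state the original chart's input), and `rescale_to_input` is a hypothetical helper name.

```python
def rescale_to_input(outputs, fuel_input, base=100.0):
    """Re-express each output flow per `base` units of input."""
    return {name: value * base / fuel_input for name, value in outputs.items()}


# Outputs as fixed in the original chart.
original_outputs = {"electricity": 30.0, "heat": 45.0}

# ASSUMPTION: hypothetical input, not taken from the EPA catalog.
hypothetical_input = 150.0

per_100 = rescale_to_input(original_outputs, hypothetical_input)

# With the input fixed at 100, the sum of outputs reads directly as a percentage.
efficiency_pct = sum(per_100.values())

print(per_100, efficiency_pct)
```

Once the input is pinned at 100, each output is its own efficiency figure, so the reader no longer has to divide anything.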


Reference: "Catalog of CHP Technologies", US EPA Combined Heat and Power Partnership.

Embedding logic

Bernard L. (from France) submitted this bubble chart for consideration.  It accompanied an NYT article claiming the absence of evidence of election fraud.  (Of course, as is well-known, absence of evidence is not the same as evidence of absence.  Here, I'm purely interested in data presentation.)

As a seasoned consultant, Bernard asked if a Marimekko chart would be superior.

This is one ambitious chart.  Ignoring the bubbles (which are more nuisance than anything), we are asked to interpret data at three different levels of aggregation in one go.

First, there were 95 cases classified into five indictment types.  Second, these cases resulted in either convictions or acquittals/dismissals.  Third, among the cases ending in convictions (the highlighted area), we were shown the occupations of those convicted.

By flattening three levels into one table, some key information is obscured.  For example, how many cases resulted in conviction?  The reader has to compute either 95-25 or 26+31+10+3.  What percent of civil rights violation convictions were committed by party/campaign workers?  It's not 2/3 = 67% (bottom row) but rather 2/2 = 100%.
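The arithmetic the reader is forced to do can be written out as a short Python check. It uses only the figures quoted above; the variable names and the `percent` helper are hypothetical.

```python
def percent(part, whole):
    """Share of `whole` taken by `part`, in percent."""
    return 100.0 * part / whole


total_cases = 95
acquitted_or_dismissed = 25
conviction_counts = [26, 31, 10, 3]  # the four conviction figures quoted in the text

# Two routes to the same total number of convictions.
convictions_by_subtraction = total_cases - acquitted_or_dismissed
convictions_by_addition = sum(conviction_counts)

# Civil rights violations: the base matters.
# Dividing by all 3 cases in the row gives the misleading figure;
# dividing by the 2 convictions gives the intended one.
misleading_share = percent(2, 3)
correct_share = percent(2, 2)

print(convictions_by_subtraction, convictions_by_addition)
print(round(misleading_share), correct_share)
```

Both routes arrive at the same conviction count, but the chart prints neither, and the two percentages differ only in which level of aggregation supplies the denominator.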

The following junkart brings out the logic that is embedded in the complicated bubble-table.  While there is a lot on the page, the text labels plus the flow directions allow readers to absorb the data one level at a time.


I have not attempted the Marimekko as I am not a fan of such charts.  You're welcome to try.

Source: "In 5-Year Effort, Scant Evidence of Voter Fraud", New York Times, April 2007.

PS. I will be working through the backlog of reader submissions.  Thanks for your patience.  Keep them coming!


Remark (Apr 25 2007): Thanks to readers for keeping me honest (see comments below).  The conviction rates shown previously were indeed the inverse.  I have now fixed them.

Graphical equity 3

Zuil provides an alternative rendering of the Sankey diagram / flow chart.  This one is surely superior, being easier to understand while capturing more information than the previous example.

Ultimately, however, this type of chart will please specialists more than the general reader.

It is designed to be purely descriptive, which explains the absolute equality given to each flow, as indicated by the choice of unique colors and/or patterns for each.

As a data graphic, it can be improved if the designer has a point to make.  In that situation, only the relevant flows can be highlighted while all others stay in the background.

As it stands, this chart murmurs but does not opine.

Reference: "U.S. Energy Flow - 2002", Energy & Environment Directorate, Lawrence Livermore National Laboratory.

Graphical equity 2

Based on my last post, Zuil and Lope engaged in a lively conversation about "flow charts", apparently also called "Sankey charts" in some circles.  Here is an example Zuil found at the EIA site:

Zuil commented that

Though often difficult to draw, Sankey diagrams are IMHO unbeatable to represent any type of lossless flow (energy, money, fluids, etc).

I mostly agree: flow charts are great at tracing flows, and it's easy to figure out proportional sources and uses from this example.  Moreover, as Lope suggested, it's fun (to read).

But... the data content of this chart is lower than that of the network graph or the Marimekko.  Imagine removing all the lines (arcs) in the network graph: that is what the flow chart includes.  It achieves more readability by simplification.