## Selecting the right analysis plan is the first step to good dataviz

##### Apr 27, 2022

It's a new term, and my friend Ray Vella shared some student projects from his NYU class on infographics. There's always something to learn from these projects.

The starting point is a chart published in the Economist a few years ago.

This is a challenging chart to read. To save you the time, the following key points are pertinent:

a) income inequality is measured by the disparity between regional averages

b) the incomes are given in a double index, a relative measure. For each country and year combination, the national average GDP is set to 100. For example, a value of 150 for the richest region of Spain in 2015 means that region's average income is 50% higher than Spain's national average in that year.
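Written as a formula, the index in (b) might look like the following. The symbols are labels introduced here for illustration (region r of country c in year t), not notation taken from the Economist chart:

```latex
\[
\text{index}_{r,t} = 100 \times \frac{\text{income}_{r,t}}{\text{national average}_{c,t}}
\]
```

Under this reading, an index of 150 corresponds to a regional income that is 1.5 times the national average for that country and year.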

The original chart, as well as most of the student work, is based on a specific analysis plan. The difference in index values between the richest and poorest regions measures the degree of income inequality, and the change in that difference over time measures the change in income inequality. That is as much of a mouthful as it sounds.

This analysis plan can be summarized as:

1) all incomes -> relative indices, at each region-year combination
2) inequality = rich - poor region gap, at each country-year combination
3) inequality over time = inequality in 2015 - inequality in 2000, for each country
4) country difference = inequality in country A - inequality in country B, for each year
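
The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the countries "A" and "B", the income figures, and the function names are all made up here and are not the Eurostat data behind the chart.

```python
# Hypothetical per-capita incomes, for illustration only (NOT the Eurostat figures).
# Keys: (country, year) -> national average, richest region, poorest region.
incomes = {
    ("A", 2000): {"national": 20000, "richest": 28000, "poorest": 14000},
    ("A", 2015): {"national": 25000, "richest": 37500, "poorest": 15000},
    ("B", 2000): {"national": 30000, "richest": 39000, "poorest": 24000},
    ("B", 2015): {"national": 36000, "richest": 45000, "poorest": 27000},
}


def to_index(value, national):
    """Step 1: express an income relative to the national average (= 100)."""
    return 100.0 * value / national


def gap(country, year):
    """Step 2: inequality = richest-region index minus poorest-region index."""
    row = incomes[(country, year)]
    rich = to_index(row["richest"], row["national"])
    poor = to_index(row["poorest"], row["national"])
    return rich - poor


def gap_change(country, start=2000, end=2015):
    """Step 3: change in the inequality gap over time, for one country."""
    return gap(country, end) - gap(country, start)


def country_difference(a, b, year):
    """Step 4: difference in the inequality gap between two countries, for one year."""
    return gap(a, year) - gap(b, year)
```

With these made-up numbers, country A's gap widens from 70 to 90 index points while country B's stays at 50. Note how many subtractions the reader has to carry out mentally to reach step 4.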

***

One student, J. Harrington, looks at the data through an alternative lens that brings clarity to the underlying data. Harrington starts with change in income within the richest regions (then the poorest regions), so that a worsening income inequality should imply that the richest region is growing incomes at a faster clip than the poorest region.

This alternative analysis plan can be summarized as:
1) change in income over time for richest regions for each country
2) change in income over time for poorest regions for each country
3) inequality = change in income over time: rich - poor, for each country
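
A minimal sketch of this alternative plan follows. The post does not say whether the change in income is measured in currency or in percent, so this sketch assumes percent change. The countries, income figures, and function names are again hypothetical.

```python
# Hypothetical per-capita incomes as (income in 2000, income in 2015).
# Illustration only; these are NOT the Eurostat figures.
incomes = {
    "A": {"richest": (28000, 37500), "poorest": (14000, 15000)},
    "B": {"richest": (39000, 45000), "poorest": (24000, 27000)},
}


def pct_change(start, end):
    """Percent change from start to end (assumed metric; the post does not specify)."""
    return 100.0 * (end - start) / start


def growth_gap(country):
    """Steps 1-3: growth of the richest region minus growth of the poorest region.

    A positive value means the richest region grew faster,
    which is read here as worsening inequality.
    """
    rich_growth = pct_change(*incomes[country]["richest"])  # step 1
    poor_growth = pct_change(*incomes[country]["poorest"])  # step 2
    return rich_growth - poor_growth                        # step 3
```

With these made-up numbers, the richest region of country A outgrows its poorest region by roughly 27 percentage points, against about 3 for country B. Each quantity is a plain change over time, with no index to decode first.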

The restructuring of the analysis plan makes a big difference!

Here is one way to show this alternative analysis:

The underlying data have not changed but the reader's experience is transformed.


I'd like the per capita GDP scale to be logarithmic, so equal percentage changes are equal in length.

If the data exists, then showing the median and quartiles instead of the mean would not overwhelm the reader, I think. The colors could be eliminated to calm the chart down, rather than add two more colors for the quartiles. Or keep colors but have them be a visually calm sequence from poorest to richest, again without overwhelming detail.

Without colors, an arbitrary number of quantiles could be displayed.

Derek: Good points. Using percent change is better than currency values. My preferred metric here is share of total.

Thanks, great write up! Are the initial data accessible somewhere? I would love to try on my own to come up with a nice design.

For France, Mayotte is probably an outlier, which gives a strange impression of enormous regional inequality in an otherwise extremely centralized and unitarian nation-state.

Marc: I took it from the Eurostat website. This link seems to be a good starting point.

Thanks for the link, Kaiser! I will check the Eurostat site.
