Making colors and groups come alive
Aug 05, 2024
In the May 2024 issue of Significance, there is an enlightening article (link, paywall) about a new measure of inflation being adopted by the U.K. government known as HCI (Household Costs Indices). This is expected to replace CPI, which is the de facto standard measure used around the world. In Chapter 7 of Numbersense (link), I discuss the construction of the CPI, which critics have alleged is manipulated by public officials to be over-optimistic.
The HCI looks promising as it addresses several weaknesses in the CPI measure. First, it accounts for household spending on housing, which has always been a tricky subject, particularly for those who own their homes rather than rent. Second, it recognizes that the average inflation number, which represents the average price changes on the average basket of goods purchased by the average person, does not reflect the experience of many. The HCI measures are broken down into demographic subgroups, so it's possible to compare the HCI of retirees vs non-retirees, for example.
Then comes this multi-colored bar chart:
***
The chart is serviceable: the reader can find the story. For almost all the subgroups listed, the HCI measure comes in higher than the CPI measure (black). For the income deciles, the reader senses that the relationship is not linear, that is to say, inflation does not steadily increase (or decrease) with income. It appears that inflation is highest at both ends of the spectrum, and lowest for those who are in deciles 6 to 8. The only subgroup for whom CPI overestimates inflation is "private renter," which totally makes sense since the CPI previously did not account for "owner-occupier housing" cost.
This is a chart with 19 bars, and 19 colors. The colors do not encode any data at all, which is a bit wasteful. We can make the colors come alive by encoding subgroup identity. This is what the grouped bar chart looks like:
While this is still messy, this version makes it a bit easier to compare across subgroups. The chart simultaneously plots four different grouping methods: by retired/not, by income deciles, by housing situation and by having children/not. Within each grouping, the segments are mutually exclusive, but between groupings, the segments overlap. For example, the same person can be counted both as Retired and as having Children; at the same time, some retirees have children while others don't.
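For readers who want to try this recoloring themselves, here is a minimal Python sketch of the idea: give each of the four groupings one color, and let every subgroup inherit the color of its grouping. The grouping names follow the text above, but the subgroup labels, the variable names, and the color choices are my own placeholders, not the labels or palette used in the article.

```python
# Grouping names follow the post; the subgroup labels and colors are
# illustrative assumptions, not the article's exact labels or palette.
GROUPINGS = {
    "Retirement": ["Retired", "Non-retired"],
    "Income decile": [f"Decile {i}" for i in range(1, 11)],
    "Housing": ["Private renter", "Owner-occupier"],
    "Children": ["With children", "Without children"],
}

# One color per grouping, so that color carries information.
GROUPING_COLORS = {
    "Retirement": "tab:blue",
    "Income decile": "tab:orange",
    "Housing": "tab:green",
    "Children": "tab:purple",
}


def subgroup_colors(groupings, grouping_colors):
    """Map every subgroup label to the color of the grouping it belongs to."""
    return {
        subgroup: grouping_colors[grouping]
        for grouping, subgroups in groupings.items()
        for subgroup in subgroups
    }


COLOR_BY_SUBGROUP = subgroup_colors(GROUPINGS, GROUPING_COLORS)
```

The resulting lookup can be turned into a list of colors, one per bar, for whichever plotting tool draws the chart. The color names here happen to be ones matplotlib understands, but any palette would do.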
***
To better display the interactions between groups and subgroups, I prefer using a dot plot.
This is not a simple dot plot either. It's a grouped dot plot with four levels, one for each grouping method. One can see the distribution of HCI values across the subgroups within each grouping, and also compare the range of values from one grouping to another.
One side benefit of using the dot plot is to get rid of the non-informative space between values 0 and 20. When using a bar chart, we have to start the bars at zero to avoid distorting the encoding. Not so for a dot plot.
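To illustrate the point about the axis, here is a rough Python sketch that fits the axis limits to the data instead of anchoring them at zero, and then draws a crude text version of a grouped dot plot, one row per grouping. The numbers in `SAMPLE` are made-up placeholders, not the HCI values from the article, and the function names are my own.

```python
def axis_limits(values, pad_fraction=0.05):
    """Return (low, high) limits that hug the data with a little padding."""
    lo, hi = min(values), max(values)
    # Fall back to a fixed pad when all values are identical.
    pad = (hi - lo) * pad_fraction or 1.0
    return lo - pad, hi + pad


def dot_row(values, lo, hi, width=40):
    """Render one grouping as a text row, marking each value with an 'o'."""
    row = ["."] * width
    for v in values:
        pos = round((v - lo) / (hi - lo) * (width - 1))
        row[pos] = "o"
    return "".join(row)


# Placeholder values for illustration only; NOT the article's data.
SAMPLE = {
    "Retirement": [24.5, 25.5],
    "Income decile": [26.0, 25.0, 24.0, 23.5, 26.5],
    "Housing": [23.8, 25.8],
    "Children": [24.8, 25.2],
}

all_values = [v for vals in SAMPLE.values() for v in vals]
LO, HI = axis_limits(all_values)

# All rows share the same axis, so ranges can be compared across groupings.
for name, vals in SAMPLE.items():
    print(f"{name:<14} {dot_row(vals, LO, HI)}")
```

Because position rather than bar length encodes the value, trimming the axis this way does not distort the comparison, which is why the same trick would be inappropriate for the bar chart.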
P.S. In the next iteration, I'd consider flipping the axes as that might simplify labeling the subgroups.
why vertical lines instead of round dots though? might be confusing, like 10 --- 2 reads like it's 10 to 2 decile
Posted by: elle | Aug 18, 2024 at 01:36 PM
Elle: Round dots are fine, too! I think when I made it, I had in mind ranges, which are associated with "brackets", hence those vertical lines.
Posted by: Kaiser | Aug 18, 2024 at 10:17 PM
This needs a total rethink. There is a lack of consistency - you're asked to compare decile 6 with 9 with none. The improved version has colours that don't link to the relevant category.
Posted by: Alex | Sep 01, 2024 at 06:06 PM