Deliberately obstructing chart elements as a plot point
Feb 14, 2025
These "ridge plots" have become quite popular in recent times. The following example, from this BBC report (link), shows the change in global air temperatures over time.
***
This chart is in reality a panel of probability density plots, one for each year of the dataset. The years are arranged with the oldest at the top and the most recent at the bottom. You take those plots and squeeze every ounce of the space out, so that each chart overlaps massively with the ones above it.
The plot at the bottom is the only one that can be seen unobstructed.
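To make the obstruction concrete, here is a minimal sketch of the stacking, not the BBC's actual method. Everything in it is an assumption for illustration: the curves are normal densities with made-up means, `hidden_fraction` is a hypothetical helper, and `step` stands for the vertical gap between consecutive baselines. Each curve is treated as a filled shape painted from top to bottom, so a point on one outline is hidden whenever a later curve rises above it.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution (an assumed shape, for illustration)."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def hidden_fraction(curves, step):
    """Share of each curve's outline covered by the curves drawn after it.

    curves: density heights on a shared grid, ordered top (oldest) to bottom.
    step:   vertical distance between consecutive baselines.
    """
    n = len(curves)
    fractions = []
    for i, curve in enumerate(curves):
        hidden = 0
        for k, height in enumerate(curve):
            outline = -i * step + height  # baseline of curve i plus its height
            # A later curve is painted on top, so it hides this point
            # whenever its own outline sits higher at the same x.
            if any(-j * step + curves[j][k] > outline for j in range(i + 1, n)):
                hidden += 1
        fractions.append(hidden / len(curve))
    return fractions


grid = [x / 10 for x in range(-10, 31)]  # excess temperature from -1.0 to 3.0
means = [0.2, 0.5, 0.8, 1.1, 1.5]        # made-up yearly averages
curves = [[normal_pdf(x, m, 0.4) for x in grid] for m in means]

overlapped = hidden_fraction(curves, step=0.2)  # ridge-plot spacing
separated = hidden_fraction(curves, step=1.5)   # ordinary panel spacing

print([round(f, 2) for f in overlapped])
print([round(f, 2) for f in separated])
```

With the tight spacing, every curve except the last should lose part of its outline; with the wide spacing, nothing is hidden.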
Overplotting chart elements, deliberately obstructing them, doesn't sound useful. Is there something gained for what's lost?
***
The appeal of the ridge plot is the metaphor of ridges, or crests if you see ocean waves. What do these features signify?
The legend at the bottom of the chart gives a hint.
The main metric used to describe global warming is the amount of excess temperature, defined as the temperature relative to a historical average, set as the average temperature during the pre-industrial age. In recent years, the average global temperature has been about 1.5 degrees Celsius above the reference level.
One might think that the higher the peak in a given plot, the higher the excess temperature. Not so. The heights of those peaks do not indicate temperatures.
What's the scale of the vertical axis? The labels suggest years, but that's also a distractor. If we consider the panel of non-overlapping probability density charts, the vertical axis of each chart should show probability density. In such a panel, the year labels belong in the titles of the individual plots. On the ridge plot, the density axes are sacrificed, while the year labels are shifted to the vertical axis.
Admittedly, probability density is not an intuitive concept, so not much is lost by its omission.
The legend appears to suggest that the vertical scale is expressed in number of days, so that in any given year, the peak of the curve occurs where the most likely excess temperature is found. But the amount of excess is read from the horizontal axis, not the vertical axis: it is encoded as a horizontal displacement away from the historical average. In other words, the height of the peak still doesn't correlate with the magnitude of the excess temperature.
The following set of probability density curves (with made-up data) all have the same average excess temperature of 1.5 degrees. Going from top to bottom, the variability of the excess temperatures increases. The height of the peak decreases accordingly because in a density plot, the total area under the curve is fixed (it always equals one), so a wider curve must also be a flatter one. Thus, the higher the peak, the lower the daily variability of the excess temperature.
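The relationship between peak height and variability can be checked numerically. The sketch below assumes the curves are normal densities, which is my choice for illustration and not something the made-up data requires. The standard deviations and the helper `peak_and_area` are likewise hypothetical. For a normal curve, the peak height is 1 / (sd × √(2π)), so doubling the spread halves the peak.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution (an assumed shape, for illustration)."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def peak_and_area(mean, sd, lo=-10.0, hi=13.0, n=4600):
    """Return (highest point on the grid, trapezoid-rule area) for one curve."""
    width = (hi - lo) / n
    ys = [normal_pdf(lo + k * width, mean, sd) for k in range(n + 1)]
    area = width * (sum(ys) - 0.5 * (ys[0] + ys[-1]))
    return max(ys), area


# Same average excess (1.5 degrees), increasing made-up variability.
spreads = (0.25, 0.5, 1.0, 2.0)
results = {sd: peak_and_area(1.5, sd) for sd in spreads}

for sd, (peak, area) in results.items():
    print(f"sd={sd}: peak={peak:.3f}, area={area:.3f}")
```

The areas should all come out at about one while the peaks shrink as the spread grows, which is why a tall ridge signals a year of low daily variability and says nothing about how warm the year was.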
A problem with this ridge plot is that it draws our attention to the heights of the peaks, which provide information about a secondary metric.
To find the story that the amount of excess temperature has been increasing over time, we have to trace a curve through the ridges. Strangely enough, that curve runs from top to bottom, nearly vertical at first and then veering sideways to the right. In a more conventional chart, the line that shows growth over time moves from bottom left to top right.
***
The BBC article (link) features several charts. The first one shows how the average excess temperature trends year to year. This is a simple column chart. I assume that by supplementing the column chart with the ridge plot, the designer wants to tell readers that the average annual excess temperature masks daily variability. Therefore, each annual average has been disaggregated into 366 daily averages.
In the column chart, the annual average is compared to the historical average of 50 years. In the ridge plot, the daily average is compared to ... the same historical average of 50 years. That's what the reference line labeled pre-industrial average is saying to me.
It makes more sense to compare the 366 daily averages to 366 daily averages from those 50 years.
Have I now ruined the dataviz, because each probability density plot would have 366 different reference points? Not really. We just have to think a little more abstractly. After adjustment, each of these 366 reference temperatures is mapped to the number zero. Thus, they all coincide at the same location on the horizontal axis.
(It's possible that they actually used 366 daily averages as references to construct the ridge plot. I'm guessing not but feel free to comment if you know how these values are computed.)
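Here is a minimal sketch of the two ways of computing daily excess temperatures, under loud assumptions: the temperatures are entirely made up (a sine-shaped seasonal cycle plus a constant warming offset), three reference years stand in for the 50, and names such as `synthetic_year` and `daily_reference` are hypothetical. It is not a reconstruction of how the BBC's values were computed.

```python
import math
from statistics import mean, pstdev

DAYS = 366


def synthetic_year(warming):
    """Made-up daily temperatures: a seasonal cycle plus a constant offset."""
    return [14.0 + 2.0 * math.sin(2 * math.pi * d / DAYS) + warming
            for d in range(DAYS)]


# Three made-up reference years stand in for the 50-year historical period.
reference_years = [synthetic_year(w) for w in (-0.1, 0.0, 0.1)]
recent_year = synthetic_year(1.5)

# Option 1: one reference per calendar day (366 reference points).
daily_reference = [mean(year[d] for year in reference_years) for d in range(DAYS)]
# Option 2: a single reference, the average over the whole period.
single_reference = mean(daily_reference)

excess_vs_daily = [t - ref for t, ref in zip(recent_year, daily_reference)]
excess_vs_single = [t - single_reference for t in recent_year]

# Subtracting each day's own reference maps all 366 references to zero,
# so they share one location on the horizontal axis.
print(round(mean(excess_vs_daily), 3), round(pstdev(excess_vs_daily), 3))
print(round(mean(excess_vs_single), 3), round(pstdev(excess_vs_single), 3))
```

In this toy setup, both options give the same average excess, but the single reference leaves the seasonal cycle inside the spread of the density curve, while the day-by-day references remove it.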