Think twice before you spiral

After Nathan at FlowingData sang the praises of the following chart, a debate ensued on Twitter, as others disliked it.


The chart was printed in an opinion column in the New York Times (link).

I have found few uses for spiral charts, and this example has not changed my mind.

The canonical time-series chart is like this:




The area chart takes no effort to understand. We can see when the peaks occurred. We notice that the current surge is already double the last peak seen a year ago.

It's instructive to trace how one gets from the simple area chart to the spiral chart.


Step 1 is to center the area on the zero line, instead of using the zero line as the baseline. While this technique frequently makes for a more pleasant visual (because of our preference for symmetry), it actually makes it harder to see the trend over time. Effectively, any change is split in half between the upper and lower edges, which is why the envelope of the area is less sharp.
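Here is a minimal sketch of why centering halves every change. The case counts and the function name are made up for illustration; they are not data from the chart.

```python
def centered_envelope(values):
    """Return (upper, lower) edges when each value is centered on the zero line."""
    upper = [v / 2 for v in values]
    lower = [-v / 2 for v in values]
    return upper, lower

# Hypothetical case counts at two points in time (not taken from the NYT chart).
cases = [100_000, 300_000]

upper, lower = centered_envelope(cases)

# Movement of the top edge when zero is the baseline (ordinary area chart).
rise_on_area_chart = cases[1] - cases[0]
# Movement of the top edge after centering: half of the change goes up, half goes down.
rise_on_centered_chart = upper[1] - upper[0]
```

Under these assumed numbers, the top edge of the centered chart moves half as far as the top edge of the area chart for the same change in cases.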


In Step 2, I massively compress the vertical scale. That's because when you plot a spiral, you are forced to fit each cycle of data into a much shorter range. Such compression causes the year-on-year doubling of cases to appear less dramatic. (Actually, the aspect ratio is devastated: while the vertical scale is hugely compressed, the horizontal scale is dramatically stretched out by the curled-up design.)


Step 3 may elude your attention. If you simply curl up the compressed, centered area chart, you don't get the spiral chart. The key is to ask about the radius of the spiral. As best I can tell, the radius has no meaning; it is gradually increased so that each year of data has its own "orbit". What would the change in radius translate to on our non-circular chart? It should mean that the center of the area is gradually lifted away from the zero line. On the right chart, I mimic this effect. (I only measured the change in radius every 3 months, so the change is more angular than displayed in the spiral chart.) The problem I have with this step is that it serves no purpose, while it complicates cognition.

In Step 4, just curl up the object into a ball based on aligning months of the year.
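The four steps can be sketched as a single coordinate mapping. This is my own reconstruction under stated assumptions, not the code behind the published chart: the function name, the base radius, the growth per year, and the value scale are all hypothetical.

```python
import math

def spiral_edges(day, value, base_radius=1.0, growth_per_year=1.0,
                 value_scale=1e-6, days_per_year=365):
    """Map one (day, value) pair to the outer and inner edge points of a spiral band.

    All parameter defaults are illustrative assumptions, not the chart's settings.
    """
    turns = day / days_per_year
    # Step 4: the angle depends only on the position within the year, so months align.
    angle = 2 * math.pi * turns
    # Step 3: the center line drifts outward so each year gets its own orbit.
    center = base_radius + growth_per_year * turns
    # Steps 1 and 2: the value is compressed, then split evenly around the center line.
    half_width = value * value_scale / 2
    outer_r = center + half_width
    inner_r = center - half_width
    outer = (outer_r * math.cos(angle), outer_r * math.sin(angle))
    inner = (inner_r * math.cos(angle), inner_r * math.sin(angle))
    return outer, inner

# The same calendar day one year apart lands on the same ray, one orbit further out.
outer_year0, inner_year0 = spiral_edges(0, 100_000)
outer_year1, inner_year1 = spiral_edges(365, 100_000)
```

Note that in this sketch the radius of the center line carries no data at all; it only encodes elapsed time, which the angle already encodes.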


This is the point when I realized I missed a Step 2B. I carefully aligned the scales of both charts so that the 150K cases shown in the legend on the right have the same vertical representation as on the left. This exposes a severe horizontal rescaling. The length of the horizontal axis on the left chart is many times shorter than the circumference of the spiral! That's why I said earlier that one of the biggest features of this spiral chart is that it imposes a dubious aspect ratio, one that is extremely wide and extremely short.
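To get a feel for how wide and short the unrolled spiral is, one can approximate the length of the spiral's center line and compare it with the thickness of the band. The geometry below (two turns, radius growing from 1 to 3 units, a band at most 0.15 units thick) is assumed for illustration and was not measured from the chart.

```python
import math

def spiral_path_length(turns, base_radius=1.0, growth_per_turn=1.0, steps=10_000):
    """Approximate the length of the spiral's center line by summing short chords."""
    def point(i):
        theta = 2 * math.pi * turns * i / steps
        r = base_radius + growth_per_turn * theta / (2 * math.pi)
        return r * math.cos(theta), r * math.sin(theta)

    length = 0.0
    prev = point(0)
    for i in range(1, steps + 1):
        cur = point(i)
        length += math.dist(prev, cur)
        prev = cur
    return length

# Hypothetical geometry: two years of data, one turn per year.
time_axis_length = spiral_path_length(turns=2)
band_height = 0.15  # assumed maximum thickness of the band, in the same units
aspect_ratio = time_axis_length / band_height
```

With these assumed dimensions, the time axis comes out well over a hundred times longer than the tallest part of the band.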

As usual, think twice before you spiral.



Visual design is hard, brought to you by NYC subway

This poster showed up in a NY subway train recently.


Visual design is hard!

What is the message? The intention is, of course, to say Rootine is better than others. (That's the Q corner, if you're following the Trifecta Checkup.)

What is the visual telling us (V corner)? It says Rootine is yellow while Others are purple. What do these colors mean? There is no legend to help decipher them. And yellow-purple doesn't have a canonical interpretation (unlike, say, red-green). In theory, purple can be better than yellow.

The other mystery is the black dot on the fifth item. (This is the NYC subway so the poster could have been vandalized.) It could mean "diet + lifestyle analyzed" is a unique feature of Rootine, not available on any other platform. That would imply that purple means available but not as effective, which significantly lessens the impact of the chart.


Finally, let's imagine the data that may exist to support this chart.

The aggregation of all competitors to "Others" imposes a major challenge. If yellow means yes, and purple means no, we'd expect few if any purple dots because across all competitors, there is a good chance that at least one of them has a particular feature.
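A back-of-the-envelope sketch of that argument, assuming each competitor offers a given feature independently with the same probability. Both the independence assumption and the numbers are hypothetical.

```python
def prob_any_competitor_has_feature(p, n):
    """Chance that at least one of n competitors offers a feature,
    assuming each offers it independently with probability p."""
    return 1 - (1 - p) ** n

# Hypothetical inputs: 10 competitors, each with a 30% chance of offering the feature.
chance = prob_any_competitor_has_feature(0.3, 10)
```

Even with a modest per-competitor probability, the chance that "Others" collectively lack the feature shrinks quickly as the number of competitors grows.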

Next, I'm dubious about the claim of "precision dosed, unique to you". I'm imagining they are selling some kind of medicine or health food, which can be "dosed". Predictive modelers like to market their models as "personalized," unique to each person, but such a thing is impractical. Before you start using their products, they have no data on you, or on your response to those products. How could the recommendation be "precision dosed, unique to you"?

Even if you've used the product for a while, it will be tough to achieve a good level of optimality with so little data. In fact, given that your past data are used to generate actions intended to improve your health - that is to say, to cause the future data to diverge from the past data - how do you know that any change you observe next period is caused by the actions you took? The pre-post difference is affected by both temporal shifts and the actions you've taken. If the next period's metric improves, you may want to believe that the actions worked. If the next period's metric declines, are you willing to conclude that the actions you took backfired?
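A toy illustration of the identification problem, assuming the two influences simply add up. The additive form and all the numbers are assumptions made for the sake of the example.

```python
def observed_change(action_effect, temporal_shift):
    """Pre-post difference under an assumed additive model."""
    return action_effect + temporal_shift

# Hypothetical (action_effect, temporal_shift) pairs, in arbitrary metric units.
scenarios = {
    "actions worked": (5, 0),
    "actions did nothing": (0, 5),
    "actions backfired": (-3, 8),
}

changes = {name: observed_change(*parts) for name, parts in scenarios.items()}
```

All three stories produce the same observed improvement, so the pre-post difference alone cannot tell them apart.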

"Formulas improve with you". This makes me more worried than relieved.


Problems like these can be solved by showing our work to others. Sometimes, we're so immersed in our own world that we don't see we have left out key information.



Start at zero, or start at wherever

Andrew's post about start-at-zero helps me refine my own thinking on this evergreen topic.

The specific example he gave is this one:


The dataset is a numeric variable (y) with values over time (x). The minimum numeric value is around 3 and the range of values is from around 3 to just above 20. His advice is "If zero is in the neighborhood, invite it in". (Link)

The rule, as usual, sounds simpler than it really is. In the discussion, Andrew highlights several considerations.

Is zero a meaningful reference value? In his example, we assume it is and so we invite zero in. But, as Andrew also says, if zero is meaningless, then recall the invitation. So context must be accounted for.

In Chapter 1 of Numbersense (link), I looked at some SAT score data of applicants to competitive colleges. Is zero a meaningful reference value for SAT scores? Someone might argue yes, since it is the theoretical minimum score that anyone could get from the test. Any statistician will likely say no, since a competitive college will never have seen an applicant submitting a score of zero, or anywhere close to zero. Thus, starting such a chart at zero inserts a lot of whitespace and draws attention to a useless insight - how far above the theoretical worst performer is someone's score.
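One way to make the rule concrete is a small heuristic like the one below. It is only a sketch: the function name, the notion of "neighborhood" as a fraction of the data's spread, and the 0.5 threshold are my assumptions, not part of Andrew's advice. The sample values are stand-ins for the ranges described above, not actual data.

```python
def invite_zero(values, zero_is_meaningful, neighborhood=0.5):
    """Decide whether the y-axis should be extended down to zero.

    Zero counts as "in the neighborhood" when the gap between zero and the
    smallest value is at most `neighborhood` times the spread of the data.
    The 0.5 default is an arbitrary threshold chosen for illustration.
    """
    if not zero_is_meaningful:
        return False  # context says zero is not a useful reference value
    lo, hi = min(values), max(values)
    spread = hi - lo
    return lo >= 0 and spread > 0 and lo <= neighborhood * spread

# Stand-in for Andrew's example: values from around 3 to just above 20.
andrew_like = invite_zero([3, 12, 20.5], zero_is_meaningful=True)

# Stand-in for SAT scores at a competitive college (hypothetical numbers).
sat_like = invite_zero([1300, 1450, 1600], zero_is_meaningful=False)
```

The context flag does most of the work here; the numeric threshold only matters once we have agreed that zero means something.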


What about the left panel of Andrew's chart makes us uncomfortable? I ask myself this question. My answer is that the horizontal axis highlights an arbitrary value that distracts from the key patterns of the data.

As shown below, the arbitrary value is ~2.5. This is utterly meaningless.


What if 0 is also a meaningless value for this dataset? I'd recommend "bench the axis". Like this:


An axis is a tool to help readers understand a chart. If it isn't serving a function, an axis doesn't need to be there. When I choose a line chart for time-series data, I'm drawing attention to temporal change in the numeric values, or the range of values. I'm not saying something about the values relative to some reference number.

From this example, we also see that the horizontal axis should not be regarded as a hanger for time labels. Time labels can exist by themselves.



Getting to first before going to second

Happy holidays to all my readers! A special shout-out to those who've been around for over 15 years.


The following enhanced data table appeared in Significance magazine (August 2021) under an article titled "Winning an election, not a popularity contest" (link, paywalled).

It's surprisingly hard to read and there are many reasons contributing to this.

First is the antiquated style guide of academic journals, in which they turn legends into text, and insert the text into a caption. This is one of the worst journalistic practices that continue to be followed.

The table shows 50 states plus District of Columbia. The authors are interested in the extreme case in which a hypothetical U.S. presidential candidate wins the electoral college with the lowest possible popular vote margin. If you've been following U.S. presidential politics, you'd know that the electoral college effectively deflates the value of big-city votes so that the electoral vote margin can be a lot larger than the popular vote margin.

The two sub-tables show two different scenarios: Scenario A is a configuration computed by NPR in one of their reports. Scenario B is a configuration created by the authors (Leinwand et al.).

The table cells are given one of four colors: green = needed in the winning configuration; white = not needed; yellow = state needed in Scenario B but not in Scenario A; grey = state needed in Scenario A but not in Scenario B.


The second problem is that the above description of the color legend is not quite correct. Green, it turns out, is only correctly explained for Scenario A. Green for Scenario B encodes those states that are needed for the candidate to win the electoral college in Scenario B minus those states that are needed in Scenario B but not in Scenario A (shown in yellow). There is a similar problem with interpreting the white color in the table for Scenario B.
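The encoding is easier to see when written out as set logic. The sketch below reflects my reading of the Scenario B sub-table; the state labels are placeholders rather than the actual configurations from the article.

```python
def color_states(scenario_a, scenario_b, all_states):
    """Assign each state the color it appears to carry in the Scenario B sub-table."""
    colors = {}
    for state in all_states:
        in_a = state in scenario_a
        in_b = state in scenario_b
        if in_a and in_b:
            colors[state] = "green"   # needed in both scenarios
        elif in_b:
            colors[state] = "yellow"  # needed in B but not in A
        elif in_a:
            colors[state] = "grey"    # needed in A but not in B
        else:
            colors[state] = "white"   # needed in neither scenario
    return colors

# Placeholder states and configurations, purely hypothetical.
all_states = ["S1", "S2", "S3", "S4"]
scenario_a = {"S1", "S2"}
scenario_b = {"S1", "S3"}

colors = color_states(scenario_a, scenario_b, all_states)

# To recover Scenario B's winning configuration, the reader must combine two colors.
needed_in_b = {s for s, c in colors.items() if c in ("green", "yellow")}
```

In other words, no single color in the Scenario B sub-table answers the basic question of which states the candidate needs to win.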

To fix this problem, start with the Q corner of the Trifecta Checkup.


The designer wants to convey an interlocking pair of insights: the winning configuration of states for each of the two scenarios; and the difference between those two configurations.

The problem with the current design is that it elevates the second insight over the first. However, the second insight is a derivative of the first so it's hard to get to the second spot without reaching the first.

The following revision addresses this problem:


[12/30/2021: Replaced chart and corrected the blue arrow for NJ.]