Presented without comment

Jun 26, 2020

Weekend assignment - which of these tells the story better?

Or:

The cop-out answer is to say both. If you must pick one, which one?

***

When designing a data visualization as a living product (not static), you'd want a design that adapts as the data change.


I’d have to say the linear plot. The initial impression from the log plot is that cases are levelling off, which is clearly not the case.

If I had to choose between these two, I'd go with the linear chart, since none of the underlying viral curves follow an exponential function anyway. And I'd rather see new cases on a linear scale than cumulative cases.

PS: I love your blog. Been following it for years, even though it's the first time I ever comment!

Trick question - they are both quite bad. Of the two, the linear one is better since it doesn't hide the rate of change as much as the log chart does. Much better is a chart showing the rate of new cases (or deaths, depending) vs time, i.e. the slope of that linear chart. In addition, the rate should be normalized per capita to have any hope of comparing curves. The trickiest thing of all is to account for varying per-capita testing rates, and I've not yet seen anyone try to deconvolve this anywhere (not to mention that it would add to already noisy/sketchy data). The best you can do is compare countries with similar data-related capabilities and per-capita testing rates (like Canada and the USA). Even then, there are a lot of factors making direct comparison difficult (like changes in testing rates over time, and test reporting methodology). All this data is wonderful, but it is extremely difficult to extract comparable charts from it. There's certainly enough data now to draw some interesting/major conclusions, but comparisons over time are still tricky and can be misleading.
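To make the suggestion above concrete, here is a minimal sketch of turning a cumulative series into daily new cases per 100,000 people. The function name, the counts, and the population figure are all made up for illustration; they are not taken from the charts in the post, and the sketch does not attempt to correct for testing rates.

```python
def daily_new_per_100k(cumulative, population):
    """Daily new cases per 100,000 people from a cumulative series.

    `cumulative` is a list of running totals, one per day;
    `population` is the country's population (hypothetical here).
    """
    # New cases on a day are the difference between consecutive totals,
    # i.e. the slope of the cumulative (linear-scale) chart.
    new_cases = [curr - prev for prev, curr in zip(cumulative, cumulative[1:])]
    # Normalize per capita so countries of different sizes can be compared.
    return [n * 100_000 / population for n in new_cases]


# Hypothetical numbers, for illustration only.
rates = daily_new_per_100k([100, 150, 250, 400], population=1_000_000)
print(rates)
```

The result has one fewer entry than the input, since the first day has no previous total to difference against.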

I agree with Bgc099 (including on plotting new cases instead of total cases): when the growth is linear, it is natural to prefer the linear chart.
One point in favour of the logarithmic scale is that in the linear chart it is difficult to tell countries apart, except for the few main ones; that is, if the user cannot zoom in on the chart, it is difficult to evaluate the slope of a path. On the other hand, if y=kx, then log(y)=log(k)+log(x), so if two growths are linear, on a logarithmic scale they appear shifted vertically by a constant amount; roughly speaking, it is possible to separate the pandemic's evolution (the shape of the curve) from the size of the country (the height of the curve).
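The identity above can be checked numerically. The following is a small sketch, with made-up growth constants (5000 and 200 cases per day are assumptions, not data from the post), showing that two linear growths y=kx stay a constant vertical distance log(k1/k2) apart on a log axis.

```python
import math


def log_gap(k1, k2, days):
    """Vertical gap on a log10 axis between y = k1*x and y = k2*x, per day."""
    return [math.log10(k1 * x) - math.log10(k2 * x) for x in days]


# Hypothetical growth constants, for illustration only.
gaps = log_gap(5000.0, 200.0, range(1, 101))
expected_gap = math.log10(5000.0 / 200.0)

# The gap is the same on every day: the curves share one shape, log10(x),
# and differ only by a vertical shift of log10(k1 / k2).
print(all(math.isclose(g, expected_gap) for g in gaps))
```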

If the story is about the overall growth worldwide, the log graph would be a good starting place: it doesn't highlight the outliers.

If the story is about the outcome of country policies, well, that linear graph has something very specific to say. But the title of your graphic suggests it should be about much more.

Since the title is "how rapidly are they rising?", I have to ask - which is rising the fastest? Since rising is slope, it looks like the U.S. and Brazil are tied in the linear graph but Brazil is rising faster on the log graph. I believe that's a distortion - but I can't prove it.
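One way to see why the two charts can disagree: slope on a linear axis is the absolute daily increase, while slope on a log axis is (approximately) the relative increase. The sketch below uses invented totals for two countries "A" and "B"; they are assumptions for illustration, not the actual U.S. or Brazil figures.

```python
import math


def linear_slope(prev_total, curr_total):
    """Slope on a linear axis: absolute change in cumulative cases."""
    return curr_total - prev_total


def log_slope(prev_total, curr_total):
    """Slope on a log axis: change in log(cases), roughly the fractional growth."""
    return math.log(curr_total) - math.log(prev_total)


# Hypothetical totals on two consecutive days, for illustration only.
a_prev, a_curr = 2_000_000, 2_040_000   # country A: +40,000 on a base of 2M
b_prev, b_curr = 1_000_000, 1_040_000   # country B: +40,000 on a base of 1M

# Same absolute increase, so the lines look tied on the linear chart.
print(linear_slope(a_prev, a_curr) == linear_slope(b_prev, b_curr))

# B's increase is a larger share of its total, so B looks steeper on the log chart.
print(log_slope(b_prev, b_curr) > log_slope(a_prev, a_curr))
```

Under these assumptions neither chart is wrong; they answer different questions (how many more cases versus what percentage more).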
