Illustrating differential growth rates

Reader Mirko was concerned about a video published in Germany that shows why the new coronavirus variant is dangerous. He helpfully provided a summary of the transcript:

The South African and the British mutations of the SARS-CoV-2 virus are spreading faster than the original virus. On average, one infected person infects more people than before. Researchers believe the new variant is 50 to 70% more transmissible.

Here are two key moments in the video:


This seems to be saying the original virus (left side) replicates 3 times inside the infected person while the new variant (right side) replicates 19 times. So we have a roughly 6-fold jump in viral replication.


Later in the video, it appears that every replicate of the old virus finds a new victim while the 19 replicates of the new variant land on 13 new people, meaning 6 replicates didn't find a host.

As Mirko pointed out, the visual appears to have run away from the data. (In our Trifecta Checkup, we have a problem with the arrow between the D and the V corners. What the visual is saying is not aligned with what the data are saying.)


It turns out that the scientists have been very confusing when talking about the infectiousness of this new variant. The most quoted line is that the British variant is "50 to 70 percent more transmissible". At first, I thought this was a comment on the famous "R number". Since the R number around December was roughly 1 in the U.K., the new variant might bring the R number up to 1.7.

However, that is not the case. From this article, it appears that being 50 to 70 percent more transmissible means R goes up from 1 to 1.4. R is interpreted as the average number of people infected by one infected person.

Mirko wonders if there is a better way to illustrate this. I'm sure there are many better ways. Here's one I whipped up:


The left side is for the 40% higher R number. Both sides start at the center with 10 infected people. At each time step, if R=1 (right side), each of the 10 people infects one other person, so the total infections increase by 10 per time step. If R=1.4 (left side), the number of new infections grows by 40% at every time step instead of staying flat. It's immediately obvious that a 40% higher R is very serious indeed. Starting with 10 infected people, in 10 steps, the total number of infections is almost 1,000, almost 10 times higher than when R is 1.
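For readers who want to check the arithmetic, here is a minimal sketch in Python. It assumes, as the chart does, that the new infections in each time step equal R times the new infections in the previous step, and it counts only infections that occur after the initial 10. The function name and its parameters are mine, invented for illustration; this is not an epidemiological model.

```python
def new_infections_total(r, seed=10, steps=10):
    """Sum the new infections over `steps` time steps.

    Assumption: each step's new infections are r times the previous
    step's new infections; the `seed` cases themselves are not counted.
    """
    current = float(seed)  # people infected in the most recent step
    total = 0.0
    for _ in range(steps):
        current *= r       # each person infects r others on average
        total += current
    return total


baseline = new_infections_total(1.0)  # R = 1
variant = new_infections_total(1.4)   # R = 1.4, i.e. 40% higher
print(round(baseline), round(variant), round(variant / baseline, 1))
```

Under these assumptions the R=1 side adds 100 infections over 10 steps while the R=1.4 side adds a bit under 1,000. Counting the initial 10 on both sides would shift the totals slightly but not the overall picture.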

The lines of the graphs simulate the transmission chains. These are "average" transmission chains since R is an average number.


P.S. [1/29/2021: Added the missing link to the article in which it is reported that 50-70 percent more transmissible implies R increasing by 40%.]





Dave C.

Does your graphic assume an infinite "susceptible" population? What about the natural constraints of a limited population, some portion of which has already developed antibodies?


DC: Good question. I assume R is applied at each time step to the new infections of the previous time step which means I assumed a built-in recovery/death mechanism. Also, I'm not modeling the end state when the susceptible population tapers away. As far as I know, experts have refused to confirm that there is lasting immunity from infection; some are predicting the disease will become endemic.
It is not clear to me what "R number" is as commonly used. R0, which is the R number at the start when the entire population is susceptible, is the key parameter in the SIR model. So I just interpreted it in the most convenient way so I can show the graphical form. If you know more details about how to interpret R over time, please add to these comments.

Dave C.

I have no background in epidemiology, but found this paper helpful:


DC: Keeping that paper in my archive. It doesn't address the R number as used by the U.K. government, which is a time-varying quantity. Conceptually, I think it means at any sample point of time, given the size of the infected population and the number of susceptibles, how many susceptibles are being infected on average by each infected person. Of course, no one can measure infections (as opposed to cases), nor can anyone know who does or does not have immunity, so any estimate of R is a lot of assumptions plus a little, indirect data. (One of the reasons I don't spend much time talking about R on my blogs.) The way I interpreted R, I'm assuming the time steps are selected in such a way that the only relevant information passed between time steps is the number of newly infected in the previous time step. In other words, the people infected two time steps back are no longer contagious, whether this is due to death or recovery, while those infected one time step back are still contagious (on average).


It shouldn't matter whether R is based on actual infections or observed infections, provided the ratio between them is constant. If an observed infection produces 2 observed infections, that should also mean that an actual infection produces 2 actual infections.

The paper that discusses the R for the new variant is at with the interesting results at the end of page 11. This was published in early January so there would be more data now. At that time the difference in R had very wide confidence intervals.


Ken: In theory yes. But actual infections are not measurable so the ratio between actual and observed is not known. Besides, the ratio is clearly not constant, and depends on factors like testing strategies, and prevalence of disease. Any computation of R is necessarily a lot of assumptions mixed with a low-quality indirect measure of "cases".
