


A Palaz

Hey Kaiser,

So in the Danish study I don't think they do much regression; they just reweight by patient-days. E.g. in the care home population they vaccinated early, leaving only 1868 not vaccinated after maybe 15-17 days.

Here is my guess at the raw cases among the unvaccinated, broken out by their time windows.

454 for days 0-14, etc.

Why do they not present this? It's very annoying. Maybe you have an idea. Possibly a feeling that it will not look robust with the shrunken unvaccinated sample?

Thinking about that, how should it be dealt with?

This connects to this post because the link I see across studies is the question of which is better for presenting results: patient-days at risk or patients.

I think days are better because of the relationship between infections and exposure, but obviously this produces the same effect you describe under Simpson's paradox.

So how does this connect here? Well, you made a slip in one post saying "hours at risk", and for the CDC study this applies.

One more adjustment for these special high-risk groups would be hours of high-risk exposure × degree of risk.

So just to add one more to your great list, some of the variance might also be explained WITHIN and across risk groups by such factors.

I now look at every study that does not present detailed patient-days with disappointment, and ask why.


AP: I think the level of disclosure in these studies is well below what's required, especially since most of these are "interim" studies where the data have not been baked in yet. Calendar time becomes much more crucial because of (a) the use of case-counting windows and (b) the changing environment. Also, I do not understand why they do not publish their model - the CDC study did not publish its model either. Saying it's a Cox regression is not enough.

The CDC study addresses the issue you brought up - that you can't use infection rates per person when more and more of the cohort are getting vaccinated. So these new studies - unlike RCTs - use person-time as the denominator. As I said above, this effectively splits a vaccinated person's timeline into two parts, first counting as unvaccinated and then as vaccinated. This is the so-called Andersen-Gill extension to Cox. It's possible that this is what the Danish study did as well but nothing in the paper tells us that.
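To make the person-time bookkeeping concrete, here is a minimal sketch of how one person's follow-up can be split into unvaccinated and vaccinated days. It is not the CDC's or the Danish team's code, since neither published a model. The `Person` fields, the function names, and the choice to switch status on the day of vaccination (with no case-counting window) are all my assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    # All fields are hypothetical; days are counted from the start of the study.
    start: int                 # first day of follow-up
    end: int                   # end of follow-up (exclusive)
    vax_day: Optional[int]     # day of vaccination, None if never vaccinated
    case_day: Optional[int]    # day of infection, None if never infected


def split_person_time(p: Person) -> tuple[int, int]:
    """Return (unvaccinated_days, vaccinated_days) for one person."""
    # Follow-up stops at infection or at the end of the study, whichever is first.
    stop = p.end if p.case_day is None else min(p.end, p.case_day)
    if p.vax_day is None or p.vax_day >= stop:
        return stop - p.start, 0
    vax_start = max(p.vax_day, p.start)
    return vax_start - p.start, stop - vax_start


def person_time_rates(cohort: list[Person]):
    """Aggregate person-days and cases by vaccination status at the time."""
    days = {"unvax": 0, "vax": 0}
    cases = {"unvax": 0, "vax": 0}
    for p in cohort:
        unvax_days, vax_days = split_person_time(p)
        days["unvax"] += unvax_days
        days["vax"] += vax_days
        if p.case_day is not None and p.start <= p.case_day < p.end:
            vaccinated = p.vax_day is not None and p.case_day >= p.vax_day
            cases["vax" if vaccinated else "unvax"] += 1
    rates = {k: cases[k] / days[k] if days[k] else float("nan") for k in days}
    return rates, days, cases


# Made-up cohort of four people followed for 100 days.
cohort = [
    Person(0, 100, None, None),  # never vaccinated, never infected
    Person(0, 100, 30, None),    # vaccinated on day 30, never infected
    Person(0, 100, 30, 50),      # vaccinated on day 30, infected on day 50
    Person(0, 100, None, 20),    # never vaccinated, infected on day 20
]
rates, days, cases = person_time_rates(cohort)
```

Note that a person vaccinated on day 30 contributes 30 days to the unvaccinated denominator before contributing anything to the vaccinated one. A real analysis would also have to handle the case-counting windows, which this sketch ignores.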

But that adjustment does not deal with the sharp decline in infection rate from December to March, and the fact that the unvaccinated exposure is primarily in the earlier weeks when infection rates were much higher.
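A toy calculation shows how large this calendar-time effect can be. The numbers below are invented for illustration and come from neither study: three periods with a falling background infection rate, unvaccinated person-days concentrated in the early periods, and a vaccine that by construction does nothing (`true_ratio=1.0`).

```python
# Hypothetical periods: (background cases per 1,000 person-days,
#                        unvaccinated person-days, vaccinated person-days)
periods = [
    (4.0, 9000, 1000),
    (2.0, 5000, 5000),
    (1.0, 1000, 9000),
]


def crude_rate_ratio(periods, true_ratio=1.0):
    """Vaccinated/unvaccinated rate ratio pooled over all periods."""
    unvax_cases = sum(rate / 1000 * u for rate, u, _ in periods)
    vax_cases = sum(true_ratio * rate / 1000 * v for rate, _, v in periods)
    unvax_days = sum(u for _, u, _ in periods)
    vax_days = sum(v for _, _, v in periods)
    return (vax_cases / vax_days) / (unvax_cases / unvax_days)


def period_rate_ratios(periods, true_ratio=1.0):
    """Rate ratio computed separately within each period."""
    ratios = []
    for rate, u, v in periods:
        unvax_rate = (rate / 1000 * u) / u
        vax_rate = (true_ratio * rate / 1000 * v) / v
        ratios.append(vax_rate / unvax_rate)
    return ratios


pooled = crude_rate_ratio(periods)        # about 0.49
by_period = period_rate_ratios(periods)   # each about 1.0
```

Within every period the rate ratio is 1, yet the pooled ratio is 23/47, or about 0.49, which would read as roughly 51% effectiveness for a vaccine with no effect at all. Splitting person-time by vaccination status does nothing to remove this; only comparing like calendar periods does.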

I also think - and someone please correct me if I'm wrong - that the AG extension does not address the self-selection bias problem in this data. All it does is address a timing bias that would arise if a standard analysis were applied.


Kaiser Fung. Business analytics and data visualization expert. Author and Speaker.