
A Palaz

So I notice some things in this report. The tables seem to switch between a confirmed and an unconfirmed group, and the age groups change too. Do you know why?

On the seroconverted, there seem to be two sets: those that were excluded at the start, and another for the asymptomatic section in the protocol set.

In the asymptomatic set, should we add the periods together, assuming someone who tests positive once remains seropositive, or should we treat them as separate periods, 1-29 and after 29?



AP: Yes, those items are in my notes too but didn't make the cut for the blog.

The unconfirmed cases appeared briefly in the VE Constellation section. I don't accept the excuse that there wasn't enough time to do confirmatory tests on those samples. We are in a global emergency, they are moving things along at "warp speed" and they just couldn't find the time to do 200 "rapid" tests. I pointed out what they also claimed, which is that if those cases were treated as if confirmed, their conclusions don't change. But here's the thing: the aggregate number may not change but different cuts of the data surely do change. Otherwise, we'd have to believe that the unconfirmed cases are a perfect random sample of all suspected cases (which I find highly unlikely). That may be why those unconfirmed cases sometimes show up in the tables, and sometimes don't. They are showing the best-case scenario for each analysis. What this practice does is to shift the game from "trust the methodology" to "trust the researcher". I'd add that J&J is not the only pharma that plays this game. For example, Moderna's entire application was based on data in which far fewer than half of the participants had reached the required 2-month median follow-up time - with the promise that a completed but not "confirmed" analysis of the required data would not have changed the finding. That's "trust the researchers" again.
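To make the "aggregate unchanged, cuts changed" point concrete, here is a minimal sketch with invented counts. None of these numbers come from the J&J report; the age-group labels, the counts, and the equal-sized-arms assumption are all hypothetical, chosen only to show that adding non-randomly distributed unconfirmed cases can leave the overall VE untouched while moving the subgroup VEs.

```python
# Hypothetical counts, invented purely for illustration (not from any trial).
# Both arms are assumed to be the same size within each age group,
# so VE reduces to 1 - (vaccine cases / placebo cases).

confirmed = {"younger": (20, 60), "older": (20, 60)}    # (vaccine, placebo)
unconfirmed = {"younger": (2, 18), "older": (8, 12)}    # (vaccine, placebo)


def ve(vaccine_cases, placebo_cases):
    """Vaccine efficacy when both arms have equal size."""
    return 1 - vaccine_cases / placebo_cases


def add_tables(a, b):
    """Add case counts group by group."""
    return {g: (a[g][0] + b[g][0], a[g][1] + b[g][1]) for g in a}


def aggregate_ve(table):
    """VE after pooling all groups."""
    vaccine = sum(v for v, _ in table.values())
    placebo = sum(p for _, p in table.values())
    return ve(vaccine, placebo)


def group_ve(table):
    """VE within each group."""
    return {g: ve(v, p) for g, (v, p) in table.items()}


combined = add_tables(confirmed, unconfirmed)

agg_confirmed = aggregate_ve(confirmed)
agg_combined = aggregate_ve(combined)
by_group_confirmed = group_ve(confirmed)
by_group_combined = group_ve(combined)

print(f"aggregate VE, confirmed only:       {agg_confirmed:.1%}")
print(f"aggregate VE, with unconfirmed:     {agg_combined:.1%}")
for g in confirmed:
    print(f"{g}: {by_group_confirmed[g]:.1%} -> {by_group_combined[g]:.1%}")
```

In this made-up example the aggregate VE is the same either way, yet one subgroup looks better and the other looks worse once the unconfirmed cases are included. Whoever chooses which version of each table to publish gets to pick the more flattering one.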

I did not discuss the asymptomatic analysis because the only trial that even made an honest attempt at that was the AstraZeneca-Oxford. You can't learn anything about asymptomatics when you only test participants once every month. They also did not disclose compliance rates on the testing, nor compare the characteristics of those who did the tests with those who refused. In addition, the error bars on those estimates are huge to the point of being meaningless. They re-defined the base population and the time periods because this entire analysis was not pre-specified in the protocol. The "after Day 29" analysis is an example of why I have criticized the arbitrary choice of case-counting windows. It makes the numbers look better but nothing changed practically except throwing out a bunch of cases - and not changing the denominator.
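A small sketch of the case-counting window effect, again with invented numbers. The arm size, the window labels, and the case counts are hypothetical and not taken from any trial document. The only thing that changes between the two calculations is which cases are counted; the denominator stays the same.

```python
# Hypothetical numbers for illustration only (not from any trial).
N_PER_ARM = 10_000  # assumed participants per arm, unchanged across windows

cases = {
    "early_window": {"vaccine": 40, "placebo": 45},  # before the cutoff day
    "late_window": {"vaccine": 10, "placebo": 30},   # after the cutoff day
}


def ve(vaccine_cases, placebo_cases, n_vaccine=N_PER_ARM, n_placebo=N_PER_ARM):
    """VE = 1 - (attack rate on vaccine / attack rate on placebo)."""
    return 1 - (vaccine_cases / n_vaccine) / (placebo_cases / n_placebo)


# Counting every case from the first day.
all_vaccine = sum(w["vaccine"] for w in cases.values())
all_placebo = sum(w["placebo"] for w in cases.values())
ve_all = ve(all_vaccine, all_placebo)

# Counting only cases in the late window, same denominator.
ve_late = ve(cases["late_window"]["vaccine"], cases["late_window"]["placebo"])

dropped = cases["early_window"]["vaccine"] + cases["early_window"]["placebo"]

print(f"VE counting all cases:       {ve_all:.1%}")
print(f"VE counting late cases only: {ve_late:.1%}")
print(f"cases thrown out:            {dropped}")
```

With these made-up counts, the headline VE roughly doubles simply because most of the cases were discarded. Nobody in the trial got any less sick.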

A Palaz

So I agree the range is big. I am more interested in the validity of looking at the asymptomatic data from the point of view of seroconversion.

I think their definitions of the subsamples seem reasonable if we consider the time to conversion for IgG. Does that mean we can take the VE value for this subsample (c and d) as reasonably indicative?

Also, I am not sure why the ranges are so wide. Any ideas? Is it sample size, or maybe days at risk? Something else?

I did not see the AstraZeneca section; I will take a look.



AP: Validity - there were 19,600 at risk by Day 28, which means 30% of the participants did not have serological test results. Why? Is the serology risk set representative of all at risk? They didn't tell us.

Day 29 - I think it's unreasonable to define symptomatic cases as after Day 14 and then total cases (including asymptomatics) as after Day 29.

Ranges - it's just a game of numbers. The result they want us to look at is based on fewer than 1,400. It appears to have a less ridiculous error bar just because in that case-counting window, the ratio of infection rates shows a far larger difference (it went from 15% higher to 3x higher on placebo). Not only is this result based on fewer than 1,400, but J&J also used a staged enrollment (as did AstraZeneca), always enrolling healthier, younger participants first, so those who have a longer follow-up period are healthier and younger.
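For readers who want to see where the width of the error bar comes from, here is a rough sketch. It uses the textbook log-scale (Katz) approximation for a risk-ratio confidence interval, which is not necessarily the method used in the J&J analysis, and the case counts and arm sizes are invented. It also converts "15% higher" and "3x higher" on placebo into VE, reading both as the ratio of the placebo infection rate to the vaccine infection rate.

```python
import math


def ve_from_placebo_ratio(ratio):
    """VE implied by a placebo-to-vaccine infection rate ratio."""
    return 1 - 1 / ratio


def ve_with_ci(cases_vaccine, n_vaccine, cases_placebo, n_placebo, z=1.96):
    """Point estimate and approximate 95% interval for VE.

    Uses the log-scale (Katz) approximation for the risk ratio. This is an
    assumption made for illustration, not the method from any trial report.
    """
    rr = (cases_vaccine / n_vaccine) / (cases_placebo / n_placebo)
    se = math.sqrt(
        1 / cases_vaccine - 1 / n_vaccine + 1 / cases_placebo - 1 / n_placebo
    )
    rr_low = math.exp(math.log(rr) - z * se)
    rr_high = math.exp(math.log(rr) + z * se)
    # A high risk ratio means low efficacy, so the bounds swap.
    return 1 - rr, 1 - rr_high, 1 - rr_low


ve_15pct = ve_from_placebo_ratio(1.15)  # placebo rate 15% higher
ve_3x = ve_from_placebo_ratio(3.0)      # placebo rate 3 times higher

# Hypothetical counts: same 3x ratio, small versus large sample.
small = ve_with_ci(10, 700, 30, 700)
large = ve_with_ci(100, 7_000, 300, 7_000)
small_width = small[2] - small[1]
large_width = large[2] - large[1]

print(f"VE at 15% higher on placebo: {ve_15pct:.1%}")
print(f"VE at 3x higher on placebo:  {ve_3x:.1%}")
print(f"small sample: VE {small[0]:.1%}, interval {small[1]:.1%} to {small[2]:.1%}")
print(f"large sample: VE {large[0]:.1%}, interval {large[1]:.1%} to {large[2]:.1%}")
```

Two things drive the interval in this sketch: the number of cases, and how far the ratio is from 1. A small sample can still produce an interval that excludes zero if the window chosen happens to show a large ratio, which is the "game of numbers" described above.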

Kaiser Fung. Business analytics and data visualization expert. Author and Speaker.