Some colleges and universities have re-opened for in-person classes for the fall term. Previously, I examined the Cornell model (link), which outlines a very aggressive plan for preventing the virus from entering the campus, and then snipping transmission chains during the semester. Also, we learned that several schools that opened in August quickly ran into trouble as hundreds if not thousands of cases surfaced, and some decided to return to online teaching. (link)
I just attended a fantastic webinar by Peter Frazier, who's part of the Cornell team that did the statistical modeling work supporting the re-opening. The webinar was sponsored by Columbia.
It's about three weeks after Cornell reopened for in-person instruction, and the situation in Ithaca is significantly better than on most campuses. The following chart best illustrates why the team is taking a victory lap, although they absolutely realize that there are still risks ahead.
[Applause]
[Note: I expunged a number of Excel defaults from the above chart, including line colors, legend, and date labels. I also replaced "infections" with "cases", which is a more precise description of the metric.]
As I indicated in the previous post, if Cornell's plan proves successful, it would repudiate what the U.S. government did in the early phase of the pandemic. It shows that extremely aggressive surveillance - and in particular, broad-based testing - will drive the case rate down to around zero, and keep it there. That's what other countries like China and South Korea managed to do.
At the end of the talk, Prof. Frazier offered advice to data scientists who want to make real-world impact. I agree with every bullet point and am reiterating the whole list here.
- Focus on solving the real-world problem, rather than what can be published. He discloses that they have not written a paper about the work they did here. This is advice I give to researchers who want to transition into industry positions.
- Create mental models of your computer models. A different way of expressing this point is that the modeler should develop some intuition behind how the models work, even if the methodologies are complex and opaque.
- Communications. He spent a lot of time explaining the model to the public, and the team was open to feedback and continuously listening to the community. People are emotional about the subject, and some of the commentary was harsh but it's part of the job. Frazier's presentation today is impressive evidence of his communication skills.
- Understand the political environment. Different groups have different objectives and biases.
- He calls this "luck." The key decision-makers at Cornell are scientists by training, so the modeling team had an easier time persuading them to believe in science. That's an important point when it comes to "choosing your boss." For many data scientists, jobs have been plentiful, and job satisfaction will depend on whether you're fighting with your boss all the time. Imagine the scientists at FDA or CDC who have to deal with political appointees who are reportedly pressuring them to alter their language. Those meetings would be a completely different beast than what the Cornell team had to deal with.
Throughout the presentation, Prof. Frazier pointed to other "luck" factors that favor Cornell's response. For example, Ithaca is very isolated. He said that the nearest towns are an hour's drive away. Also, their veterinary school has abundant expertise in PCR testing, and has been extremely helpful.
He also reported that pessimists were much more vocal than optimists, and so the assumptions may have been pushed to be more conservative than warranted. Modelers may also have a conservative bias because it is better to be safe than sorry.
***
In his presentation, Prof. Frazier demonstrated the power of statistical modeling. The core simulation model is highly complex, with many moving parts and assumptions, as I discussed in the earlier post. The assumptions are necessary because most of those parameters are not directly observable, and highly uncertain. While the model was not "accurate" in the sense of forecasting precise outcomes, it accurately predicted the general trends. Making assumptions is not a bad thing!
Then, we saw Occam's razor in action. Statisticians have long preached that models should be as simple as possible, but not simpler. The core model is actually built on top of simple epidemiological models, and those models alone were able to capture the general trends.
***
Since the time I reviewed the Cornell plan, there were a couple of important additions that probably contributed to the on-the-ground situation being better than the original projection.
First, the frequency of surveillance testing was doubled for undergraduates from once a week to twice a week. The insight is to tailor the testing rate to the number of contacts expected in different subgroups. The mathematical models help calibrate what level of testing is required to keep the infection rate below a critical threshold.
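To build some intuition for why testing frequency matters, here is a minimal sketch of my own; it is not the Cornell model. It assumes that an infected person is isolated as soon as a test result comes back, that tests are perfectly sensitive, that infection happens at a uniformly random moment within the testing interval, and that transmission is spread evenly over the infectious period. The function name and every number (an R0 of 2.5, a 7-day infectious period, a 1-day result delay) are illustrative assumptions, not figures from the talk.

```python
def effective_r(r0, infectious_days, test_interval_days, result_delay_days=1.0):
    """Rough effective reproduction number under regular surveillance testing.

    Assumptions (mine, for illustration only): perfect test sensitivity,
    immediate isolation once a result is returned, infection time uniformly
    distributed within the testing interval, and even transmission over the
    infectious period.
    """
    d, t, delay = infectious_days, test_interval_days, result_delay_days
    if delay >= d:
        # Results arrive after the infectious period ends: testing does not help.
        return r0
    cutoff = d - delay  # longest wait for the next test that still shortens spread
    if cutoff >= t:
        # Every case is caught before the infectious period ends.
        mean_days_circulating = t / 2 + delay
    else:
        # Cases that wait longer than `cutoff` circulate for the full period.
        mean_days_circulating = (cutoff ** 2 / 2 + delay * cutoff + d * (t - cutoff)) / t
    return r0 * mean_days_circulating / d


# Hypothetical subgroup with R0 = 2.5 and a 7-day infectious period
weekly = effective_r(2.5, 7.0, 7.0)
twice_weekly = effective_r(2.5, 7.0, 3.5)
print(f"weekly testing:       R_eff = {weekly:.2f}")
print(f"twice-weekly testing: R_eff = {twice_weekly:.2f}")
```

Under these made-up inputs, weekly testing leaves the effective reproduction number above 1 while twice-weekly testing pushes it just below 1. A subgroup with more contacts (a higher R0) would need more frequent testing to get under the same threshold, which is the calibration idea described above.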
Second, to work around the limited capacity for contact tracing, and to take advantage of their expertise in PCR testing (via the vet school), they are testing the entire social circle for each positive case. They call this "adaptive testing". The goal is to reveal as many hidden cases as possible before the virus has a chance to spread further.
Testing is a huge commitment in terms of money, resources and time. Cornell is conducting over 5,000 tests per day - obviously, they are returning results very fast. It appears also that the students are cooperating, and grasping the key issues. This pandemic is a test of community spirit.
***
Perhaps lost in our national conversation is the decisiveness of early, aggressive action. Public-health experts have stressed this point from the start, and the Cornell experience (relative to other universities) shows they were right.
Take compliance with testing as an example. When there isn't much virus in circulation, we can accommodate some degree of non-compliance because the non-compliant person will rarely run into an infectious contact. When the epidemic is in full swing, the chance of encountering infectious people is much higher, and non-compliant members are more likely to spread the virus around.
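A back-of-the-envelope calculation makes the point concrete. This is my own sketch, assuming each contact is independently infectious with probability equal to the prevalence (random mixing); the prevalence levels and the 20 contacts are hypothetical numbers chosen for illustration.

```python
def prob_meets_infectious(prevalence, contacts):
    """Chance that at least one of `contacts` independent contacts is infectious.

    Assumes random mixing: each contact is infectious with probability
    `prevalence`, independently of the others.
    """
    return 1.0 - (1.0 - prevalence) ** contacts


# Hypothetical scenarios: 20 contacts at low versus high prevalence
low = prob_meets_infectious(prevalence=0.001, contacts=20)
high = prob_meets_infectious(prevalence=0.05, contacts=20)
print(f"prevalence 0.1%: {low:.1%} chance of meeting an infectious contact")
print(f"prevalence 5%:   {high:.1%} chance of meeting an infectious contact")
```

With these inputs, the chance goes from about 2 percent to about 64 percent, so the same amount of non-compliance is far more costly once the virus is widespread.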
We can't think linearly. Any community is moving toward one of two final states: almost no virus in the community, or almost everyone has been infected, at which point the virus also spreads slowly because most contacts are between people who have already been infected. Without intervention, the second state is inevitable. Once an epidemic gets out of control, it's very hard to revert to the first state.
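The two end states show up even in the simplest textbook epidemic model. Below is a minimal sketch using a basic SIR (susceptible-infected-recovered) model stepped forward with Euler updates; it is not the Cornell simulation, and the parameter values (a 7-day infectious period, 0.1 percent initially infected, reproduction numbers of 0.8 and 2.5) are assumptions I picked for illustration.

```python
def sir_final_attack_rate(r_eff, infectious_days=7.0, initial_infected=0.001,
                          days=2000, dt=0.1):
    """Fraction of the community ever infected under a basic SIR model.

    r_eff is the reproduction number at the start, when nearly everyone is
    susceptible. All parameter values are illustrative assumptions.
    """
    gamma = 1.0 / infectious_days   # recovery rate per day
    beta = r_eff * gamma            # transmission rate per day
    s = 1.0 - initial_infected      # susceptible fraction
    i = initial_infected            # currently infected fraction
    for _ in range(int(days / dt)):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
    return 1.0 - s  # everyone who left the susceptible pool


contained = sir_final_attack_rate(r_eff=0.8)
uncontrolled = sir_final_attack_rate(r_eff=2.5)
print(f"R = 0.8: {contained:.1%} of the community ever infected")
print(f"R = 2.5: {uncontrolled:.1%} of the community ever infected")
```

When the reproduction number is held below 1, the outbreak fizzles with well under 2 percent ever infected; when it is above 1, the large majority ends up infected. There is not much middle ground, which is why I say we can't think linearly. In this simple model the uncontrolled outcome is "most" rather than literally "almost everyone", and the exact share depends on the assumed reproduction number.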
Cornell's success to this point is showing us that places like China and South Korea can indeed eliminate the virus with early aggressive actions. These countries don't have to fake the data if they did the right things.
Whether the success will last is an open question. It's only three weeks into the term. Prof. Frazier noted that they strongly discourage students from traveling outside Ithaca. I'm not sure this is possible during breaks and Thanksgiving. Similarly, countries that successfully fought off the first wave still face risks if the virus is re-imported from other countries that failed in their initial response. For Cornell, I'm guessing that they will have to roll out arrival testing after each long break.
Perhaps - or perhaps it just is another reminder that the first wave left us close to herd immunity with much much higher infections than believed particularly amongst young students, a big chunk immune through T-Cells and not showing up in anti-body testing and a substantial part of the population just plain and simply naturally resistant (like a big proportion of school kids so no surprise if a lot of students aren't just the same).
Every analysis that gets through the mainstream press ducks these questions.
Posted by: Michael Droy | 09/23/2020 at 04:26 PM
MD: The Cornell model reaches theoretical "herd immunity" when 75 percent of the community is infected. With broad-based testing, they know the case rate is close to zero. Even at Cornell, in the first week when their testing program wasn't fully operational, they found about a dozen cases a day. At other campuses, where they don't test this much, students are getting infected. The theory about young students could be true; the onus is on the proponents of such theory to prove their case.
Posted by: Kaiser | 09/23/2020 at 04:36 PM