
Please poke holes in my Covid-19 math (SF Bay Area)

58 points| drodio | 6 years ago |drodio.com | reply

78 comments

[+] kragen|6 years ago|reply
The biggest hole is that the doubling time is usually closer to 7 days (10% growth in cases per day) than to 3 days (41% growth in cases per day). We do see 41% daily growth in reported cases in times where the testing is catching up to a much larger population of undetected cases, but overall 7 days is a more reasonable doubling time. On that assumption, 2040 × 2⁴·⁴ = 43068 actual cases in the Bay Area in a month.

When you're modeling exponential growth, most of your possible errors are just an additive time shift, a small one if the growth is rapid. For example, if the actual current cases are 4080, a 100% error in the estimate above of 2040, that just moves the time to reach those 43068 cases from 31 days away to 24 days away. But an error in the exponential growth rate is a multiplicative time distortion, leading to an exponentially large error at any given point in the future.
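
A minimal sketch of the point above, using the comment's own figures (2040 starting cases, 7-day doubling, a target of 2040 × 2⁴·⁴ cases). The function name `days_to_reach` is illustrative, and the 3-day case is the article's assumption, shown only for contrast.

```python
import math

def days_to_reach(target, current, doubling_days):
    """Days until `current` cases grow to `target` at a fixed doubling time."""
    return doubling_days * math.log2(target / current)

target = 2040 * 2 ** 4.4  # roughly 43,000 cases, the figure quoted above

# Doubling the starting estimate only shifts the date by one doubling time.
base = days_to_reach(target, 2040, 7)      # about 31 days
shifted = days_to_reach(target, 4080, 7)   # about 24 days

# Changing the doubling time rescales the whole timeline instead.
fast = days_to_reach(target, 2040, 3)      # about 13 days

print(round(base, 1), round(shifted, 1), round(fast, 1))
```

A 100% error in the starting count costs exactly one doubling time (7 days here), while moving from a 7-day to a 3-day doubling time more than halves the time to reach the same count.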

[+] Jamesbeam|6 years ago|reply
I strongly disagree that the doubling time is closer to 7 days.

Look at Germany, which has a very good overview of the spread of the pandemic because testing is free, available, and rapid (5h turnaround time, capacity for 12,000 tests a day). Going by the official German numbers from the Robert Koch Institute, you currently get a case doubling time close to 3 days. And we'll start seeing in a week or two how well the drastic containment measures they are starting now are slowing the growth of the viral spread.

Date         Confirmed cases
11.03.2020   1,288
12.03.2020   1,567
13.03.2020   2,369
14.03.2020   3,062
15.03.2020   3,795
16.03.2020   4,838

In a country like the US, where the author makes his calculations, the government frankly has no idea how many people are already infected, because testing is not free, not available, and not rapid (current turnaround 3-5 days) for most citizens. The case doubling time there may be shorter or longer, but if Germany, which reacted much better to the viral threat, is having problems reaching a case doubling time of 6+ days, and if you consider that China is at 80,000 cases now even though it put the most drastic containment measures in place at only 600 confirmed cases, there is a high probability we are all in for a very wild ride.

That said, every day counts, because there are promising randomized controlled studies going on in China which will hopefully give doctors access to a drug they can actually help people with. Right now all they can do is support your breathing and pump you full of a cocktail of meds without knowing whether it will work for the specific patient. Everyone who lands in the ICU with Covid-19 is basically a guinea pig for the rest of us healthy people at this point.

In the end all that matters right now is that we don't know enough about the illness to treat it effectively, and we are in desperate need of time because no government was prepared for a pathogen like this that usually is a once in a century event.

P.S.: This is a current snapshot. If containment measures like those in China are working, the case doubling time rises quite fast. China is at 21+ days now. But I think it's important to take this day by day and re-evaluate the numbers. The "doomsday predictions" will most likely be wrong and just a theoretical numbers game.
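
As a rough check on the German figures quoted in this comment, the doubling time can be estimated from the first and last data points. This is only a sketch: it assumes steady exponential growth across the six days and ignores day-to-day reporting noise. The variable names are illustrative.

```python
import math

# Confirmed cases in Germany as quoted in the comment above (RKI figures),
# one value per day from 11.03.2020 to 16.03.2020.
cases = [1288, 1567, 2369, 3062, 3795, 4838]

days = len(cases) - 1
growth = (cases[-1] / cases[0]) ** (1 / days)   # average daily growth factor
doubling_days = math.log(2) / math.log(growth)  # days for cases to double

print(f"daily factor {growth:.2f}, doubling time {doubling_days:.1f} days")
```

On these six points the average daily factor comes out near 1.30 and the doubling time a little under 3 days, consistent with the "close to 3 days" claim for this snapshot.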

[+] wcoenen|6 years ago|reply
> closer to 7 days (10% growth in cases per day) than to 3 days (41% growth in cases per day).

A doubling time of 3 days corresponds to 26% growth per day (not 41%), because 1.26^3 ≈ 2.
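
The conversion in both directions, as a short sketch (the function names are illustrative):

```python
import math

def daily_growth(doubling_days):
    """Daily growth rate (as a fraction) implied by a doubling time in days."""
    return 2 ** (1 / doubling_days) - 1

def doubling_time(daily_rate):
    """Doubling time in days implied by a daily growth rate (as a fraction)."""
    return math.log(2) / math.log(1 + daily_rate)

print(f"3-day doubling: {daily_growth(3):.1%} per day")
print(f"7-day doubling: {daily_growth(7):.1%} per day")
print(f"41% per day: doubling every {doubling_time(0.41):.1f} days")
```

By the same formula, 41% growth per day corresponds to a doubling time of about 2 days, since 1.41^2 ≈ 2.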

[+] svara|6 years ago|reply
Could you expand on your observation regarding the true growth rate? I don't think that's accurate. Italy's outbreak is entering its fourth week, they have a very constant daily factor of ~1.33. Most countries are on the same trajectory. The only ones to have slowed down are China, South Korea and possibly Iran, though I really doubt the Iran numbers.

You can see that here, all the others are linear in a log plot: https://studylib.net/coronavirus-growth

[+] Spare_account|6 years ago|reply
>The Bay area currently has 204 confirmed cases (as of 3/15). Multiplying that by either 10x or 50x (Harvard's estimated ratio of confirmed:unconfirmed cases) to get the actual number of confirmed cases today: 2,040 - 10,200 actual current cases in the Bay area.

>...

>80% of those will be "mild" which means "possibly as bad as having pneumonia but not needing a hostpital stay." 14% will require a hospital bed.

I have an issue with extrapolating the number of serious cases this way. I assume the 80% figure for non-serious cases is based on the number of formally tested patients.

Therefore the 80% figure doesn't include all the community-spread undiagnosed cases that must also be non-serious (or they would have been in hospital).

The percentage of serious cases amongst the true number of cases is presumably much smaller than 20%?

[+] YZF|6 years ago|reply
It's the nature of all these figures that you can calculate them relatively easily for known cases, but they are somewhat more elusive for all infections.

That said I think you can put an upper bound on the number of undiagnosed cases out there by looking at deaths and hospitalizations (in places where that data is reasonably accurate).

There's at least a few ways of getting ballpark figures:

- https://cmmid.github.io/topics/covid19/severity/diamond_crui... (EDIT: this gives some estimates by age, correcting the data from the Diamond Princess cruise, which I think is one of the best data sets we have since most passengers were quarantined/tested/monitored).

- looking at data from countries with reasonably strong testing regimes and responses. E.g. Israel or Canada are two I'm following. In Canada 12% of the cases so far required hospitalization (EDIT: https://torontosun.com/news/provincial/ontario-sees-spike-of... )

- Surveillance test data for countries that do that.

20% does sound on the high side from what I've seen and read. My mental ballpark is 5%-10%. I'm sure there are local variations (smokers, air pollution, age distribution, general health...)

EDIT: The US has 62 reported deaths. Just estimating based on a 0.5% IFR, that gives 12,400 cases. Then you have to adjust this for the time between infection and death. Let's say 2 weeks? That puts the current number at maybe on the order of 100k+. Obviously this will have a very large error bar, but it gives us some intuition. Feel free to poke holes in my math ;)
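
A sketch of that back-of-the-envelope estimate. The 62 deaths, 0.5% IFR, and 2-week lag come from the comment; the doubling times tried below are assumptions, not figures the comment gives, and the result is very sensitive to them.

```python
def implied_infections(deaths, ifr, lag_days, doubling_days):
    """Infections today implied by deaths to date.

    deaths / ifr estimates infections as of `lag_days` ago; that count is
    then grown forward at the assumed doubling time.
    """
    infected_then = deaths / ifr
    return infected_then * 2 ** (lag_days / doubling_days)

print(f"two weeks ago: {62 / 0.005:,.0f}")
for doubling_days in (3, 5, 7):  # assumed values, for illustration only
    today = implied_infections(62, 0.005, 14, doubling_days)
    print(f"doubling every {doubling_days} days: {today:,.0f} today")
```

Across these assumed doubling times the result ranges from roughly 50,000 to roughly 300,000, which brackets the "order of 100k+" figure.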

[+] asdfasgasdgasdg|6 years ago|reply
There is also the issue of assuming that social distancing only reduces the number of cases by 90%. Social distancing affects the exponent. Depending on how aggressive it is it could reduce the height of the peak by a lot more than 90%.

I also have no idea what the basis is for the 10-50x assumption re: confirmed/unconfirmed cases.

[+] mokus|6 years ago|reply
Why start today? There were 5 cases confirmed in the US as of Jan 25, 50 days ago, at least some of which appeared to be community transmission. Applying this math for the US as a whole with the 10x initial factor gives 5 * 10 * 2^(50/3) ~= 5.2 million cases TODAY. If we don’t believe that number (or do we? Hell, I have no idea at this point), why do we believe the same math with today as initial conditions?
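
The arithmetic in this comment, next to the article's own 30-day projection, as a sketch. `projected_cases` is an illustrative name; the inputs are the figures quoted in the thread.

```python
def projected_cases(confirmed, undercount, days, doubling_days):
    """Confirmed cases scaled by an undercount factor, then doubled repeatedly."""
    return confirmed * undercount * 2 ** (days / doubling_days)

# Article's method applied to the US starting from Jan 25 (5 confirmed cases).
us_today = projected_cases(5, 10, 50, 3)

# Article's method applied to the Bay Area starting from 3/15 (204 confirmed).
bay_in_30_days = projected_cases(204, 10, 30, 3)

print(f"US today, starting Jan 25: {us_today:,.0f}")
print(f"Bay Area in 30 days: {bay_in_30_days:,.0f}")
```

The same formula yields about 5.2 million for the US today and about 2.1 million for the Bay Area in 30 days, so the output depends heavily on which start date is chosen.
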
[+] jessriedel|6 years ago|reply
I don't think the OP math is sensible. (At the absolute most basic, if you're trying to be "conservative" about modeling the effects of social distancing on exponential growth, you should cut the growth rate rather than cutting the absolute number by some percentage.) However, that said: it makes more sense to model growth as exponential once you've documented community transmission as the primary driver. When the US had 5 cases, they were all imported from China and basically contained.
[+] creato|6 years ago|reply
> Multiply the number above by 1,024 (assumes a doubling of cases every 3 days) if you assume no social distancing measure are put in place: 2.1MM on the low end, mulitplying by a 10x ratio. (Just because if I multiply by a 50x ratio, it returns a number larger than the 7.8MM total residents in the Bay area.)

This is why the spread of a virus like this is a logistic curve, not an exponential. An exponential is a good model at first, but extrapolating too far doesn't work like this.
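
A small sketch comparing the two curves, using the article's 50x starting estimate (10,200 cases), its 3-day doubling time, and its 7.8MM population figure. The logistic form here is the simplest saturating model and assumes everyone is susceptible and the population mixes uniformly; it is an illustration, not a forecast.

```python
import math

def exponential(t, n0, doubling_days):
    """Unbounded exponential growth from n0 cases."""
    return n0 * 2 ** (t / doubling_days)

def logistic(t, n0, doubling_days, capacity):
    """Logistic growth with the same initial rate, capped at `capacity`."""
    r = math.log(2) / doubling_days
    return capacity / (1 + (capacity / n0 - 1) * math.exp(-r * t))

N0, DOUBLING, POPULATION = 10_200, 3, 7_800_000

for day in (0, 10, 20, 30, 40):
    exp_cases = exponential(day, N0, DOUBLING)
    log_cases = logistic(day, N0, DOUBLING, POPULATION)
    print(f"day {day:2d}: exponential {exp_cases:>13,.0f}  logistic {log_cases:>10,.0f}")
```

The two curves are nearly identical for the first couple of weeks. By day 30 the exponential has passed the total population, while the logistic curve stays below it.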

[+] gbrown|6 years ago|reply
The reality is even more complicated - sure, a logistic curve is a reasonable fit if we assume that the data is accurate and the population homogeneous and well mixed, but in practice things get more complicated quickly. There's a huge network of potential contacts, with a large degree of heterogeneity and important subcomponents at multiple geographic scales. In addition, the network is dynamic, not static, so that both highly granular and population scale phenomena can change the contact distribution. Two important questions are:

1. What can we actually measure with confidence to relate to a model of interest?

2. At what level of sophistication is it feasible to actually model things?

Neither question is particularly straightforward in the face of ongoing epidemics.

[+] hef19898|6 years ago|reply
Finally! I have been almost killed for calling this out. It's like some people believe epidemiologists haven't figured that out yet. And it does make a huge difference, whether the underlying function has a natural peak or not.
[+] andreyk|6 years ago|reply
"Multiply the number above by 1,024 (assumes a doubling of cases every 3 days) if you assume no social distancing measure are put in place: 2.1MM on the low end, mulitplying by a 10x ratio. "

Poke -- lots of measures are already being put into place (Stanford is effectively shut down, the big tech companies are making everyone work from home) etc.

Edit: oh, you have this right after: "Or, multiply it by less if you want to take into account various amounts of social distancing we're all doing. I'll cut that number above down by a huge amount – 90% – on the assumption that we all learn to stay home and self-isolate immediately, just to be super aggressive on my assumptions about how humanity will rise to the occasion"

Still, kinda weird to start with the assumption of no measures when there clearly are some already.

[+] cperciva|6 years ago|reply
The 90% might be conservative. If the rate of spread can be reduced by 50%, it will reduce the number of cases after 30 days by 96%.
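
A sketch of that calculation, assuming the article's 3-day doubling time as the unmitigated baseline (the comment does not state which baseline it used) and treating "rate of spread" as the exponential growth rate:

```python
def growth_multiple(days, doubling_days, rate_scale=1.0):
    """Case multiple after `days`; rate_scale scales the exponential growth rate."""
    return 2 ** (rate_scale * days / doubling_days)

unmitigated = growth_multiple(30, 3)               # 10 doublings
halved = growth_multiple(30, 3, rate_scale=0.5)    # 5 doublings
reduction = 1 - halved / unmitigated

print(f"{unmitigated:.0f}x vs {halved:.0f}x: {reduction:.1%} fewer cases")
```

Under these assumptions, halving the growth rate turns a 1024x increase into a 32x increase, a reduction of roughly 97%, in line with the figure above.
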
[+] sneak|6 years ago|reply
All the bars and restaurants are still open, and people are still allowed to visit friends and family. Most nCoV transmission is in small groups with close contact, not stadiums. Closing offices and schools is a good first step, but as long as bars and restaurants are still open and food delivery services are carrying it all over town, it’s a drop in the bucket.
[+] lkbm|6 years ago|reply
> "Or, multiply it by less if you want to take into account various amounts of social distancing we're all doing. I'll cut that number above down by a huge amount – 90% – on the assumption that we all learn to stay home and self-isolate immediately, just to be super aggressive on my assumptions about how humanity will rise to the occasion"

Dropping 90% is not a great way to calculate this. Look at the math for 61 days out (from March 8): https://youtu.be/Kas0tIxDvrg?t=465

I don't think our social distancing and hand washing will be near enough to prevent millions of deaths, but it's important to recognize that a small change in the growth rate does make a huge difference.

Estimating how much we can affect the growth rate is hard, but I think models should be based on an iffy guess at that value rather than an iffy guess at the resulting count. (Expert models certainly are.)

[+] freepor|6 years ago|reply
Yep - the Bay Area actually has a great chance at curbing this because a larger fraction of workers than anywhere else can effectively work from home.
[+] dnautics|6 years ago|reply
The exponential growth-to-the-whole-population makes a major assumption: everyone is susceptible. Given the virulence of COVID-19, and the surprising lack of total population penetration in Hubei (now that we know just how late the quarantining efforts were given the apparent latency between carrier status and symptoms) one wonders if there isn't, say, a genetic factor which makes one more susceptible than others, for starters, and if COVID-19 simply has/will burn itself out in many places.
[+] hef19898|6 years ago|reply
Plus, everyone who got it at the beginning and went through it will be immune for a certain time. So these interactions have to be taken into account as well. Plus anyone who was immune from the beginning. So no, no exponential function. But as the math guy said, "indistinguishable from an exponential in the beginning".

Edit: The math guy being 3Blue1Brown in his video linked elsewhere in this thread.

[+] codingslave|6 years ago|reply
I personally believe that there is a genetic susceptibility to the virus, but unfortunately, any kind of academic study regarding this is mostly suppressed/shamed. But if you go back and read studies of SARS, many will say outright that specific genetic pools within China were more susceptible than others.
[+] lurquer|6 years ago|reply
Back of the envelope math... 30,000 cases a few weeks ago in China... use the article's math... (calculating) ... everyone from here to Alpha Centauri is infected and in ICU today.

That may be correct... this could all be a fever-induced hallucination. It's more likely, though, that the article's assumptions are idiotic.

[+] olivierduval|6 years ago|reply
Something is missing: the average duration of a COVID patient's stay in hospital. Hopefully, it is less than 30 days, so some patients will go to hospital, stay a few days (then die or go home), and free up their room...

I don't think that it was accounted for in the calculation for number of beds in hospital

[+] coldcode|6 years ago|reply
If you wind up in the ICU, it may be 2-4 weeks before you can be released; regular hospital beds don't count. You also have to make an assumption about how many assisted breathing systems (respirators/ventilators etc.) are available. You can't simply assume that all the beds are available either, since people may already be in one with some other serious issue. In China they constructed massive hospitals out of large spaces on the fly, something I can't see happening in our for-profit system, so capacity is not likely to be flexible enough to expand quickly.
[+] earthtourist|6 years ago|reply
Another very rough way to estimate this: If COVID-19 requires 10x more hospitalizations than influenza, and peak influenza maxes out hospital resources, then we need at least 10x more capacity.

Because there are so many variables, no one knows yet what the numbers will work out to be. The only safe thing to do is to expand capacity as much as possible by taking extreme measures.

[+] chinathrow|6 years ago|reply
This is the right point of view. If you start building hospitals now, you might have a chance when the wave hits. By now, such a post will no longer get downvotes, as reality has trickled in. Have a look at the state of capacity in Italy: it's close to maxed out, with lots of improvised wards.
[+] ajross|6 years ago|reply
Peak seasonal flu in a bad season might infect 5% of a population at once. Most people don't get any single flu strain due to pre-existing immunities and the resulting slow spread of the virus. COVID-19 can be vastly higher, no one has any immunity.

The only real hope here is to prevent infections via quarantine measures, there is absolutely no way to build out the kind of health care capacity that would be needed in a pessimal outbreak.

[+] mcguire|6 years ago|reply
"The Bay area currently has 204 confirmed cases (as of 3/15). Multiplying that by either 10x or 50x (Harvard's estimated ratio of confirmed:unconfirmed cases) to get the actual number of confirmed cases today: 2,040 - 10,200 actual current cases in the Bay area."

Er, ah, uh, .... Multiplying 204 cases by an estimated ratio of confirmed to unconfirmed cases of 10 to 50 gives 2,040 - 10,200 estimated current cases in the Bay Area.

"Just because if I multiply by a 50x ratio, it returns a number larger than the 7.8MM total residents in the Bay area."

As you just discovered, a raw exponential increase cannot continue for very long, if only because infected individuals begin to have difficulty finding uninfected individuals to infect.

"80% of those will be "mild" which means "possibly as bad as having pneumonia but not needing a hostpital stay." 14% will require a hospital bed."

You may want to knock down the severity numbers a bit, since it seems likely that severe cases will be reported and confirmed in larger numbers than less severe cases.

"Potentially, even assuming aggressive social distancing, the Bay area needs 11x more beds than it has available in the next 30 days."

You probably should do this calculation in terms of bed-days: how long a given case occupies a bed on average before release/death.

Finally, as far as I know, the severity for old and infirm cases is much, much higher than the severity for those of you who are young and healthy. You may want to modify your model for local demographics.
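
The bed-days point above can be sketched as follows. All inputs are hypothetical: the starting admissions, the 3-day doubling, and both lengths of stay are placeholders chosen only to show how occupancy depends on how long each case holds a bed.

```python
def beds_occupied(daily_admissions, stay_days):
    """Beds in use each day if every admission occupies a bed for stay_days."""
    occupied = []
    for day in range(len(daily_admissions)):
        start = max(0, day - stay_days + 1)
        # Patients admitted within the last stay_days are still in a bed.
        occupied.append(sum(daily_admissions[start:day + 1]))
    return occupied

# Hypothetical admissions: 10 on day 0, doubling every 3 days, for 30 days.
admissions = [10 * 2 ** (day / 3) for day in range(30)]

short_stay = beds_occupied(admissions, stay_days=5)
long_stay = beds_occupied(admissions, stay_days=14)

print(f"total admitted over 30 days: {sum(admissions):,.0f}")
print(f"beds in use on day 30, 5-day stays: {short_stay[-1]:,.0f}")
print(f"beds in use on day 30, 14-day stays: {long_stay[-1]:,.0f}")
```

Peak occupancy is lower than cumulative admissions because discharged patients free their beds, and the gap narrows as the length of stay grows.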

[+] xhkkffbf|6 years ago|reply
The key thing is to avoid counting cases. Those are affected by the number of test kits and the test kits are just now becoming available. Of course the "cases" will soar.

It's more important to look at ICU bodies and deaths. Those aren't as affected by supply issues.

[+] dchichkov|6 years ago|reply
The dynamic in the Bay Area might be different from that in Europe, because of no public transportation.
[+] pengaru|6 years ago|reply
I agree in general for US vs. EU, here far more travel is done via automobiles.

But the Bay Area does have MUNI and BART, it's not a no public transportation situation.

[+] anewguy9000|6 years ago|reply
maybe im getting old, but this reads like very well-written marketing for drodio.com.

the long-winded introduction, use of caps and sense of urgency it piggybacks on top of and the obvious straw man. i mean if you don't know how to do the math, then following the steps as presented in the article (ie. "add this, then times it by that.." etc) you're not actually reaching your own conclusions; that's just hand waving credence to the conclusion that the reader cannot fairly evaluate.

EDIT: writing my originally short comment i noticed many other subtle cues as well. for example:

the author's prediction of the article going viral (establishing trust because they've been "right" before)

the open invitation to criticize the article (suggesting that what you're reading is essentially the result of consensus)

the claim that it's being written despite any criticism it might receive because the subject is so important (moral authority)

all of these points are presupposed by the author, and not actually derived from the work. the more i think about it the more subtle and manipulative it appears. i hope im wrong, poke holes in it! but given the subject we're discussing it's really rather unsettling. we are indeed in a crisis, drodio be damned.

[+] tunesmith|6 years ago|reply
I have a question about test positivity rate. UW Virology has been at about 8%, and that's with them soliciting tests from all over the nation over the last week, during a time when testing was very constrained. Even then, UW only reached capacity yesterday; they had capacity to spare until then. Divide the worldometers "cases per day" stat by the CDC website testing data as of a few days ago (the most recent day they declare their numbers complete), and you get around 8%. Finally, the Friday press conference mentioned the LabQuest LabCorp tests coming online and adding significant capacity, and that their test positivity rate as of then was about 2%. This is all during a time when testing is limited and, presumably, only the most urgent or probable cases are being sent off for testing.

I don't have a lot of exponential-math insight, but those rates seem low if the virus has spread like crazy already. How does that reconcile with the math in this article?

[+] pascalxus|6 years ago|reply
Take a look at this dashboard, specifically the bottom right section. The last 3 days, it seems that the number of reported cases is going down quite steeply. https://www.arcgis.com/apps/opsdashboard/index.html#/bda7594...

now, I know, today's numbers aren't fully in yet, and there are day to day fluctuations. but, if you trust the numbers for the last 3 days, it looks like the number of cases is not increasing exponentially. I know the actual case number is much higher, but, what matters is the relative percent increase or decrease from day to day, and it does not seem to be increasing in the last 3 days. am i wrong?

Starting a few days ago, almost no one is going out, except to do their big shopping. so how can the virus still spread?

[+] kens|6 years ago|reply
The Johns Hopkins data (used in your link) has had a lot of problems the last couple of days. They've been doing various refactorings such as switching from per-county to per-state data, and it's causing a lot of issues. (It's a bit alarming that the number of cases is so high that even the people just tracking the data are getting overwhelmed.)

I've been following their data for a while and I admire their work, but I've stopped believing their numbers without double-checking. (Yes, I've filed bugs.) In addition, the last day on the graph has always made it look like cases are tapering off because it's partial data.

https://github.com/CSSEGISandData/COVID-19/issues/650

[+] twoodfin|6 years ago|reply
> Starting a few days ago, almost no one is going out, except to do their big shopping. so how can the virus still spread?

I think this is a very localized view of behavior. There are many large regions, pockets, demographic slices of the country where life is going on roughly as it was a few weeks ago.

[+] trimbo|6 years ago|reply
> to get the actual number of confirmed cases today

I think for this part you meant "actual number of cases"? Since the confirmed number was 1/10th of that.

[+] eanzenberg|6 years ago|reply
The disparity of death rates among countries has more to do with infrastructure than the virus. Germany and Switzerland are proving this is just a flu, albeit a pretty bad one (0.1-0.5% death rate). More testing will prove this out by increasing the denominator by 10x.
[+] antirez|6 years ago|reply
This does not make any sense. North Italy health care is at least as good as the one in Germany and surely better than the one in Switzerland. Probably these countries are sampling a lot the general population finding many people with mild or no symptoms. In Italy because of the size of the problem the ones getting sampled are now almost solely severe cases.
[+] usaar333|6 years ago|reply
No one is getting out with a death rate that low. SK is nearing 1%.

The numbers are only so low in Germany/Switzerland because a) it's growing fast and b) they are testing aggressively. More precisely (and morbidly), the death rate is low because not enough time has passed yet.