> More than 10,000 novel variant sequences are currently discovered every week and human experts simply cannot cope with complex data at this scale
This is interesting. I think the Greek alphabet naming system would lead some people to believe the virus only mutates once every few months. Of course, the reality is that every infected individual will produce hundreds of mutations within their body. I think there's a gap in the public messaging here which if addressed could help people understand what the future direction of the pandemic might be.
This is the kind of education that needs to be done in high school during health and biology classes. We can’t hope the general public will become epidemiologically literate on 240 character tweets and 30-second television quotes. Heck, my old high school biology teacher remains a source of coronavirus FUD, so maybe that’s not even enough.
Public messaging often includes the more extensive PANGO lineage (e.g., B.1.1.529 for Omicron). I think there is a limit to what can be conveyed in each news story.
Impressive if true, although I am very sceptical of the results. The real test is to put their neck on the line and predict/warn of the next significant variant. If the authors are not willing to do this, I doubt that the algorithm is useful.
As a side note, the first author on this paper is the CEO of InstaDeep. I see it as a red flag that the CEO of a 150+ person company would put themselves as first author on this paper. Perhaps I'm unfairly judging and the CEO really was the lead contributor to the study.
I am the second author of the paper. We have been putting our head on the line for the last half a year. We detected Lambda, Mu (with a caveat, that we did not consider it competitive) and Omicron - all blindly.
We have been verifying all our predictions experimentally, post factum. And the method is purely data driven - with no fitting to the experiments or observations.
The first author came up with the approach, participated in the analysis and got his hands dirty like everyone else. While he is the CEO of a 160+ person company, this has been a labor of love for all of us, done to a large extent in the evenings, during weekends and holidays. It is indeed an unusual situation. But this was not a regular project and InstaDeep is not a regular company either.
Because of both founder effects and potentially people taking earlier containment actions, I’m unsure that’s a realistic measure.
It’s like how people said Y2K was just a conspiracy theory because most systems kept running, without realizing that the early warnings and preparations are the reason Y2K wasn’t a civilizational disruption.
> As a side note, the first author on this paper is the CEO of InstaDeep. I see it as a red flag that the CEO of a 150+ person company would put themselves as first author on this paper. Perhaps I'm unfairly judging and the CEO really was the lead contributor to the study.
Authors on scientific papers are ordered by name, so whoever's name is first is pure coincidence.
We are actively looking for motivated colleagues. Feel free to look at the page above or drop us an email at hello[at]instadeep.com.
We are also happy to make new friends and work together on exciting projects - same email as above works.
I'm all for stopping new variants of COVID. Why in the world do we have a treatment for COVID that intentionally creates mutations in COVID https://en.wikipedia.org/wiki/Molnupiravir ? I'm not a doctor but it seems pretty clear to me that Molnupiravir creates a great environment for new variants to arise (along with cancer, but that's another issue).
> Molnupiravir is indicated for the treatment of mild-to-moderate coronavirus disease (COVID-19) in adults with positive results of direct SARS-CoV-2 viral testing, and who are at high risk for progression to severe COVID-19.[1][5]
So after the risk of creating mutations in the Covid virus has been assessed and weighed against the chances of patients just outright dying due to not having this medicine available, the board full of medical professionals trained in this matter voted 13 to 10 that they thought the risk was acceptable.
That's very likely not even possible[1]; and, since it isn't realistic, it would be a waste of time, effort, and money. There are lots of coronaviruses that we have never stopped and that we live with via herd immunity, hygiene, and therapeutics[2].
Because it helps keep people alive and the risks are judged to be worth the cost in high-risk patients. Hopefully Paxlovid will obsolete this one soon enough though.
> When using a weekly watch-list with a size of 20 variants (less than 0.5% of the weekly average of new variant sequences), EWS flagged 12 WHO designated variants out of 13 (Fig. 4.A), with an average of 58 days of lead time (i.e two months) before these were designated as such by the WHO (Table S.4).
> Our system however does not accurately pinpoint the emergence of the B.1.617.2 Delta family of variants. Delta is known to be neutralised by vaccines[24] and its global prevalence can be attributed to other fitness-enhancing factors [than immune escape]. These factors, such as P681R mutation, which abrogates O-glycosylation, thus further enabling furin cleavage, are outside of the scope of our approach.
> Specifically, the EWS identified Omicron as the highest immune escaping variant over more than 70,000 variants discovered between early October and late November 2021.
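To make the watch-list numbers in the quotes above concrete, here is a minimal sketch of what a fixed-size weekly watch-list amounts to, assuming each new sequence has already been given a single numeric risk score. The scoring itself is not shown and is not the paper's method; the function name, the variant IDs, the 5,000-sequence week and the random scores are all made up for illustration.

```python
import heapq
import random

def weekly_watch_list(scores, size=20):
    """Return the `size` highest-scoring variant IDs for one week."""
    return heapq.nlargest(size, scores, key=scores.get)

# Hypothetical week: 5,000 new sequences with made-up risk scores.
rng = random.Random(0)
scores = {f"variant_{i}": rng.random() for i in range(5000)}

watch = weekly_watch_list(scores, size=20)

# Share of the week's sequences that end up on the watch-list.
fraction = len(watch) / len(scores)
print(f"{len(watch)} of {len(scores)} sequences flagged ({fraction:.2%})")
```

With 20 slots and 5,000 sequences the list covers 0.4% of the week, which is consistent with the "less than 0.5%" figure quoted above; that figure implies a weekly average of more than 4,000 new sequences.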
>More than 10,000 novel variant sequences are currently discovered every week and human experts simply cannot cope with complex data at this scale
I suspected something like this, given the frequency of viral mutation. Officials announce a dominant global strain but what proportion of positive cases are actually sequenced and evaluated for confirmation? How many undiscovered strains are actually in circulation at any given time, in geographically isolated areas? Could that explain variability in severity and/or long covid?
This is not my area and I read the press release but not the paper -- but I cannot help noticing that they mention a sensitivity/recall number (>90%) but not a specificity or precision number.
Even if you're not trying to be cynical or skeptical about this, when there are this few true positive examples available, how can one plausibly do a good job calibrating such a system?
That's the key bit here. Supervised learning is not applicable here and any sort of fitting to known labels is doomed to fail.
The system is not calibrated, tuned or parameterized. The ML part learns, in a self-supervised manner, which spike sequence features are important for a successful (proliferating) coronavirus (or, more correctly, it learns low-dimensional embeddings of multi-point interactions/co-occurrences of different amino acids in spike proteins). The rest is based on either frequentist statistics or computational biochemistry.
At no point during training is any information about sequences belonging to High Risk Variant classes (Variant of Concern, Variant of Interest, etc.) fed to EWS.
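On the recall-versus-precision question raised above, the quoted figures (12 of 13 WHO-designated variants flagged, watch-list size 20) are enough to compute recall but not precision. The sketch below shows why: precision also depends on how many entries were flagged in total, which the quotes do not give. The `n_weeks` parameter and the value passed for it are hypothetical, and the estimate assumes no variant is counted on more than one weekly list.

```python
def recall(true_pos, false_neg):
    """Fraction of real positives that were flagged."""
    return true_pos / (true_pos + false_neg)

def watch_list_precision(true_pos, list_size, n_weeks):
    """Fraction of flagged entries that were real positives.

    Assumes every weekly list holds `list_size` distinct variants that
    never reappear in another week, so this is a lower bound.
    """
    total_flagged = list_size * n_weeks
    return true_pos / total_flagged

# From the quoted passage: 12 of 13 designated variants were flagged.
r = recall(true_pos=12, false_neg=1)
print(f"recall = {r:.3f}")

# Hypothetical: if the lists had covered 10 weeks, 200 entries were flagged.
p = watch_list_precision(true_pos=12, list_size=20, n_weeks=10)
print(f"precision (10 hypothetical weeks) = {p:.3f}")
```

The recall works out to about 0.92, matching the ">90%" in the press release. The precision number moves with the assumed number of weeks, so it cannot be pinned down from this thread alone.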
Given that we still don’t know the mechanism by which variants achieve their increased fitness I’m a little suspicious of claims of using “AI” to identify new variants.
Claims that spike affinity for ACE2 (or indeed any changes to the spike protein at all) increase fitness haven’t been conclusively proven as far as I know. It’s possible that this tool is simply overfit to the known variants and wouldn’t detect a new one.
I’d be interested to hear what real virologists think of this.
I assume this is also part of a product development process: if the EWS flags a particular variant, BioNTech can begin crafting a new vaccine component and have it ready sooner, earning a competitive advantage in the (capitalist) vaccine marketplace.
Is this literally a joke? This might as well be a doomsday machine driven by weak AI selecting new Greek glyphs to drive demand for new superfluous covid vaccines.
I'm fully vaccinated, but more data points to come up with media fear mongering is the last thing anyone in the developed world needs right now.
Not exactly "finding out", but rather estimating. It's all about prioritizing testing through detecting suspiciously "good" and unexpected outliers popping up.
The idea is to introduce a bit of sanity into the current world, to calm down overreactions to every new, scary looking variant. And on the flip side - to identify the truly scary ones early on, so that proper measures can be taken.
But the idea above directly follows from the work in the paper.
spuz|4 years ago
jdavis703|4 years ago
carbocation|4 years ago
im3w1l|4 years ago
axg11|4 years ago
orintorynchus|4 years ago
jdavis703|4 years ago
mlindner|4 years ago
solididiot|4 years ago
adsodemelk|4 years ago
orintorynchus|4 years ago
fasteddie31003|4 years ago
WJW|4 years ago
kerneloftruth|4 years ago
[1] https://www.acsh.org/news/2020/11/05/covid-why-we-will-never...
[2] https://www.cdc.gov/coronavirus/general-information.html
aftbit|4 years ago
snet0|4 years ago
newsclues|4 years ago
Because creating mutations and prolonged pandemics are more profitable.
alexwg|4 years ago
est31|4 years ago
twofornone|4 years ago
abeppu|4 years ago
orintorynchus|4 years ago
dopylitty|4 years ago
maxerickson|4 years ago
Like what is the harm in doing the data collection and publishing about how it is going?
just_steve_h|4 years ago
bolbols|4 years ago
71a54xd|4 years ago
consumer451|4 years ago
wesletimk|4 years ago
n380|4 years ago
orintorynchus|4 years ago
liuliu|4 years ago
orintorynchus|4 years ago
1cvmask|4 years ago
https://www.instadeep.com/2022/01/biontech-and-instadeep-dev...
schleck8|4 years ago