Two years ago, after DeepMind submitted its first set of predictions to CASP (Critical Assessment of protein Structure Prediction), Mohammed AlQuraishi, an expert in the field, asked, "What just happened?"
Now that the problem of static protein structure prediction has been solved (prediction errors are below the threshold that is considered acceptable in experimental measurements), we can confidently answer AlQuraishi's question:
Protein Folding just had its "ImageNet moment."
In hindsight, AlphaFold v1 represented for protein structure prediction in 2018 what AlexNet represented for visual recognition in 2012.
Sometimes announcements like this are a bit over-the-top. But what really cements, to me, how big a deal this is, is the "Median Free-Modelling Accuracy" graph halfway down the page.
Scores of 30-45 for 15 years. Now scores of 87-92.
This isn't a minor improvement, it's a leap forward.
Not knowing a lot about biotechnology, I read the article and it sounds great, but how big is this as a gamechanger? Can someone comment on how big are the implications of this in, let’s say, 5 years from now, on day to day life? Does this mean that biotech is going to explode? Or just that drugs will come to market faster, perhaps cheaper for rare diseases, but from the same industry structure as always?
I continue to be impressed by how quickly DeepMind has managed to progress in such a short time.
CASP13 was a shocker to all of us I think, but many were skeptical as to the longevity of the performance DeepMind was able to achieve. I believe with CASP14 rankings now released, it's safe to say that they've proven themselves.
Congratulations to the team! This work will have far reaching impacts, and I hope that you continue to invest heavily in this area of research.
Just to add to this whole "It's not solved! Yes it is!" discussion. Note that
>According to Professor Moult, a score of around 90 GDT is informally considered to be competitive with results obtained from experimental methods.
So if we go by >= 90 as solved:
>In the results from the 14th CASP assessment, released today, our latest AlphaFold system achieves a median score of 92.4 GDT overall across all targets.
they solved for their targets, but
>Even for the very hardest protein targets, those in the most challenging free-modelling category, AlphaFold achieves a median score of 87.0 GDT (data available here).
They basically admit they still haven't "solved" it for the "most challenging free-modelling category".
Take that as you will, not sure how useful the ">= 90 is solved" criterion is since they call it "informal" themselves.
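For anyone wondering how a GDT number like 87.0 or 92.4 is put together, here is a rough sketch of the usual GDT_TS definition: the average, over cutoffs of 1, 2, 4 and 8 Å, of the percentage of residues whose predicted position lies within that cutoff of the experimental one. This is a simplified illustration, not the CASP scoring code. It assumes the two structures are already superimposed and that you have per-residue C-alpha distances in hand, whereas the real score searches over superpositions. The function name `gdt_ts` and its input format are hypothetical.

```python
from typing import Sequence


def gdt_ts(distances: Sequence[float]) -> float:
    """Simplified GDT_TS from per-residue C-alpha distances in angstroms.

    Assumption: the predicted and experimental structures are already
    superimposed. The real GDT procedure searches over many superpositions
    to maximize the residue count at each cutoff; this sketch does not.
    """
    if not distances:
        raise ValueError("need at least one residue distance")

    cutoffs = (1.0, 2.0, 4.0, 8.0)  # standard GDT_TS cutoffs, in angstroms
    n = len(distances)

    # Fraction of residues within each cutoff.
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]

    # Average the four fractions and scale to 0-100.
    return 100.0 * sum(fractions) / len(cutoffs)


if __name__ == "__main__":
    # Four residues at 0.5, 1.5, 3.0 and 9.0 angstroms from their true positions.
    print(gdt_ts([0.5, 1.5, 3.0, 9.0]))
```

Seen this way, a median of 87.0 versus an informal bar of "around 90" is a difference in how many residues land inside the tighter cutoffs, which is part of why a hard threshold for "solved" is debatable.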
CASP (Critical Assessment of protein Structure Prediction) is calling it a solution. To quote from the article:
"We have been stuck on this one problem – how do proteins fold up – for nearly 50 years. To see DeepMind produce a solution for this, having worked personally on this problem for so long and after so many stops and starts, wondering if we’d ever get there, is a very special moment."
--Professor John Moult
Co-founder and chair of CASP
"AlphaFold achieves a median score of 87.0 GDT". Game changing, and a huge improvement, but not 100% solved. Also this is for static folding. Dynamic folding and interaction is a much harder problem. Those need to be tackled too before I would consider protein folding 'solved'.
12-13 years ago in a classroom the professor for my intro to bioinformatics class said if you were to solve this problem, you would win a Nobel prize. Congrats to the team! What an achievement.
Man, I remember running folding@home years ago on my terrible laptop. Now this was done with what they say is equivalent to only 100-200 GPUs. Crazy to see how far we've come in just a short amount of time.
Pretty interesting that they only used about $15k worth of resources (retail price) to achieve this. It's not a technique that would have been out of reach for other organizations based only on not being able to afford the compute.
The vast majority of structures in the protein data bank are determined by crystallography, which involves putting the protein in a chemical cocktail that causes it to crystallize. The cocktail is very different to the chemical environment in which the protein functions, so an open question is whether the protein structure determined by crystallography (and hence learned by AlphaFold) is representative of the structure in its natural environment.
It would be very interesting if there was a way to use computational techniques to go beyond what crystallography and other experimental techniques (Cryo) can accomplish and determine the protein structure in its true biological setting. Some research into experimental methods for this includes high power X-ray pulses.
This is a lot bigger than people are assuming. If protein folding can be done quickly and cheaply, it will trickle down to a lot more than medicine. It is going to advance biofuels, food production and a lot more.
My conclusion reading this is that a gradient is a gradient is a gradient. If you can minimize one, you can minimize them all. The hard work would seem to be figuring out how to transform the problem into a gradient that your hardware can solve. It will also be interesting to see the kinds of systematic errors that will come as a result of the biases in the training set, and whether it can be used to predict what the structures would look like under slightly different conditions (e.g. pH).
I worked in the lab that helped develop folding@home, as well as the game where the crowd was the chaotically trained machine that folded and unfolded one amino acid at a time. This feels like a pretty significant new chapter in the humanity movie.
A few times, I get immense pangs of jealousy for younger people a generation or a half before me. And I'm only 30! This is one of those times.
This is amazing. If we can simulate multi-protein interactions, you could imagine in our lifetimes being able to see a fully computation-driven simulation of a human blood cell. That would be a huge breakthrough.
This is a big step forward, but the outstanding question as to whether this is useful for evaluating novel proteins is how good the confidence metric is at telling the user whether to trust the results. You can see from their examples that AlphaFold is very good but not perfect. I imagine for some proteins it will still give misleading or erroneous results, and if you can’t tell when that happens without verifying the structure experimentally, then this will likely not be that useful for new science.
We indeed stand on the shoulders of a small number of giants! I'm infinitely thankful for the work DeepMind is doing. Let's maybe celebrate this accomplishment for one day and start being worried about big tech again tomorrow. Many of the comments here usually suggest that we should live in worry and fear, but to my knowledge there is not too much historical evidence for these kinds of companies turning evil.
This sounds big, like really really big. At least from my old times providing my idle computing resources to Folding@Home and following that project, this seems like the major golden milestone for protein folding.
So, sorry to be a philistine but what specific discoveries will this lead to... will it make it easier to produce antivirals or even molecular machines?
This is a huge jump forward. Last year's performance already was a big step up over the previous, and this seems to go much further. So big kudos to the research team.
Nonetheless, I'd like to hear more from specialists outside the context of a marketing blog post before I fully buy into a claim of a solution.
There's also a rabbit hole about what 'solution' actually means. Is the performance sufficient for any protein folding prediction application that might arise in the future?
Anyone care to muse about appropriate investment strategies based on the not previously feasible research approaches that might now be possible?
Should we expect to see faster progress in large well capitalized bioscience companies -- or a sudden increase in the viability of smaller biotech and/or biotech startups ...? Are we gonna see top talent fleeing the old biotech companies to start their own ventures with a new belief that the potential for huge reward might suddenly seem achievable?
What kind of companies do we think will be the first that are able to translate this new knowledge into profits?
dang|5 years ago
https://news.ycombinator.com/item?id=25253488&p=2
We changed the URL from https://predictioncenter.org/casp14/zscores_final.cgi to the blog post, which has more background info.
cs702|5 years ago
https://moalquraishi.wordpress.com/2018/12/09/alphafold-casp...
mabbo|5 years ago
harperlee|5 years ago
mncharity|5 years ago
(submitted by furcyd : https://news.ycombinator.com/item?id=25254888 ).
partingshots|5 years ago
vadansky|5 years ago
comicjk|5 years ago
"We have been stuck on this one problem – how do proteins fold up – for nearly 50 years. To see DeepMind produce a solution for this, having worked personally on this problem for so long and after so many stops and starts, wondering if we’d ever get there, is a very special moment."
--Professor John Moult Co-founder and chair of CASP
EgoIncarnate|5 years ago
mylons|5 years ago
yarabarla|5 years ago
jeffbee|5 years ago
gabia|5 years ago
Nonetheless, impressive work!
xbmcuser|5 years ago
tgbugs|5 years ago
dluan|5 years ago
Quarrel|5 years ago
mensetmanusman|5 years ago
woeirua|5 years ago
jjk166|5 years ago
nmca|5 years ago
xyzal|5 years ago
chetan_v|5 years ago
TrackerFF|5 years ago
WanderPanda|5 years ago
breck|5 years ago
https://moalquraishi.wordpress.com/2018/12/09/alphafold-casp...
piva00|5 years ago
andy_ppp|5 years ago
phonebucket|5 years ago
m-p-3|5 years ago
breatheoften|5 years ago
lucidrains|5 years ago