
AlphaFold at CASP13: What just happened?

188 points | sytelus | 7 years ago | moalquraishi.wordpress.com

61 comments

[+] 317070 | 7 years ago
>What is worse than academic groups getting scooped by DeepMind? The fact that the collective powers of Novartis, Merck, Pfizer, etc, with their hundreds of thousands (~million?) of employees, let an industrial lab that is a complete outsider to the field, with virtually no prior molecular sciences experience, come in and thoroughly beat them on a problem that is, quite frankly, of far greater importance to pharmaceuticals than it is to Alphabet. It is an indictment of the laughable “basic research” groups of these companies, which pay lip service to fundamental science but focus myopically on target-driven research that they managed to so badly embarrass themselves in this episode.

I wonder, is this because these methods are simply 'not good enough' to have real applications in medicine yet? I know nothing of the pharmaceutical sector, but the claim that these companies don't do basic research strains my world view, given their vast profits and the government funding they receive for exactly that purpose. Is there someone in the field who knows more?

[+] cowsandmilk | 7 years ago
For the general question of pharma investing in structure prediction, I think participants in CASP overestimate the importance of structure. It is nice to have, and there certainly are structure-driven projects, but docking is so poor that computational models of how a molecule binds are often unreliable even when you have a structure of the protein, and there are plenty of case studies of such models sending teams in the wrong direction. This would only be worse in the case of AlphaFold since, as the post shows, its GDT_HA scores are still quite poor. (A rough sketch of how GDT-style scores are computed appears at the end of this comment.)

In my experience in research, pharma has found that cellular models and phenotypic assays are far more meaningful for pushing projects forward, so there is far more interest in applying machine learning to that data than in predicting protein structures. Those same methods can be applied to target-based projects whether or not you have a structure, and regardless of how flexible your protein is. Huge portions of structure-based modeling have no ability to deal with protein flexibility, even when you know there are open and closed conformations of the protein or a loop that adopts half a dozen configurations.

Basically, academics working on folding often believe far too much in the importance of structure in drug discovery. The author appears to fall into that category.
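
For context on the GDT_HA metric mentioned above: GDT-style scores average the percentage of C-alpha atoms in the predicted model that fall within a set of distance cutoffs of the experimental structure (0.5/1/2/4 Å for GDT_HA). A minimal illustrative sketch in Python, assuming coordinate arrays that are already superposed (this is not CASP's actual scoring code, which also maximizes over many superpositions):

    import numpy as np

    def gdt_ha(pred_ca, native_ca):
        # pred_ca, native_ca: (N, 3) C-alpha coordinates, assumed superposed.
        # GDT_HA = mean, over cutoffs {0.5, 1, 2, 4} Angstroms, of the
        # percentage of residues predicted within that cutoff of the native.
        dists = np.linalg.norm(pred_ca - native_ca, axis=1)
        return 100.0 * np.mean([(dists <= cut).mean()
                                for cut in (0.5, 1.0, 2.0, 4.0)])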

[+] loopasam | 7 years ago
I worked a few years in pharma R&D (Roche, which has the largest R&D budget in the industry).

In a pharma setting, the 3D structure of a protein is mostly used for drug design (https://en.wikipedia.org/wiki/Drug_design#Computer-aided_dru...), i.e. trying to understand how a chemical will physically interact with a protein and thereby modify its physiological function, in order to treat a disease.

The biggest problem comes from the fact that proteins (1) are non-static and very flexible and (2) don't exist in a vacuum; they interact with a myriad of other entities in a living system. In other words, knowing the structure of a protein and how to perturb it with a small molecule in theory does not mean you have a drug. The large majority of compounds predicted to be active against a protein target turn out not to be when tested in a biological assay. The process helps, but ultimately it's a very empirical endeavor (test a ton of different chemicals in actual experiments, try to abstract some logic, and move on from there). As a result, simply knowing the structure of a protein will not get you far down the road to finding a new drug.

On the resource topic: even in a very large pharma company, you will find only a dozen or so scientists dedicated to the topic (out of tens of thousands of employees), supporting many projects and with very little time to perform their own research. As a result, any team fully dedicated to the problem (like AlphaFold's) can easily outcompete pharma. Most of the cost in drug discovery comes from dealing with patients and clinical trials. It's only at this stage that you learn how your drug really works and how it fits into the existing market and society (think of neuroscience, for instance).

I don't want to diminish the protein structure field or the AlphaFold results (they are fascinating), but pharma's business model de facto depends very little on whether a protein structure is known. Structure is also mostly useful for designing small molecules, a class somewhat out of fashion (biologics are the top sellers in 2018, and new modalities are coming up, like RNAs and gene editing).

[+] throwawaylolx | 7 years ago
I don't know why the author is so critical of his peers. DeepMind didn't come up with a novel biological insight; they simply pointed their unparalleled AI resources at a deep learning problem. Is it really surprising that a team of world-class deep learning scientists with virtually infinite resources managed to outperform pharmaceutical companies at a deep learning problem? I don't think so.
[+] moomin | 7 years ago
The crucial observation is how their value chain works. Most of the value is in restricting the distribution of treatments. That, in turn, is achieved by getting medicines approved in the US.

There’s a lot of value near the end of the development cycle, and not near the start.

(I conclude from this that there remains a role for government in directly funding scientific research.)

[+] ramraj07 | 7 years ago
You're right that the methods are simply not good enough. Furthermore, a large fraction of pharma research has shifted to biologics, which don't depend on structure prediction for new compounds; those projects only need a high-resolution structure of their target protein (which more often than not already exists) to go on to generate antibodies and confirm they hit the desired motif on the target.
[+] sanxiyn | 7 years ago
I think you have a point. OP mentions "target-driven research". If you have a fixed target, you can do crystallography and get the structure directly; you don't need structure prediction. That is, I think pharma's core interest is in particular structures, not a general method for predicting structures.
[+] nopinsight | 7 years ago
AlphaFold is a prime example of the importance of cross-pollination between fields, of the need to fund diversified research approaches, and of inventing fundamentally new tools that are potentially applicable across a wide range of fields.

It appears that the bandwagon effect in science is real and unfortunately too prevalent. Conservatism is a powerful institutional force that directs prestige and, importantly, funding away from 'fringe' approaches. Fundamental innovation often stems from maverick thinkers, who still need time and resources to prove their ideas' worth, but too many resources (funding, time, talent, publication venues) gravitate towards eking out 1% better performance from mainstream ideas (work that, while important, industry will often fund anyway).

In life sciences, SENS, for example, took almost two decades just to start establishing itself and is still far from mainstream. All the while, 3-4 orders of magnitude more resources are expended on those 1% improvements to mainstream treatments, with little chance of yielding significantly better health outcomes for patients.

Decision science should compel us to invest more resources in risky projects with high upsides. Humanity can afford to risk 15-20% of research resources, if not more, on exploring fundamentally new approaches.

https://en.m.wikipedia.org/wiki/Strategies_for_Engineered_Ne...

[+] kieckerjan | 7 years ago
One has to admire the candor with which he talks about what must feel, to him and many of his colleagues, like an existential threat to his career and/or life's work. Impressive. In business (as in academia, I guess) there is this constant nagging fear of being blindsided by a well-funded or brilliant competitor. When that happens, I personally just want to get drunk or roll up into a ball or something.
[+] throwawaylolx | 7 years ago
>as an existential threat to his career and/or life's work.

Is it, though? DeepMind stood on the shoulders of giants: they made use of decades of biological research and wet-lab experiments. There's much more to academic research than predictive data science, which is honestly not meant to be academics' expertise at all. It was exactly their research that enabled DeepMind to reduce the problem to a solvable deep learning problem, and it is still their research that can best leverage the results of the model. I think there's an important distinction between properly understanding and pursuing the science behind a problem and fitting data for an already-formulated problem with a deep network.

[+] breatheoften | 7 years ago
> Second, regarding the question of how academic groups should respond scientifically to DeepMind’s entry, I suspect the right answer comes from evolution: adapt. Focus on problems that are less resource intensive, and that require key conceptual breakthroughs and less engineering.

I disagree with this notion a bit - the idea that academic researchers interested in managing their careers should focus on less resource-intensive problems in the future, to avoid having to compete with the resource advantage of a DeepMind ...

I feel I've seen this position expressed in analogous forms over the years in the internet space: "don't try to compete with $huge_tech_company on the internet because of the $scale challenges they invested $megabucks into solving". This kind of statement was made over and over as the internet explosion was going on. If you actually go back and look at the details, you'll find stories like "Google built $inhouse technology so they could scale to 100 * x visits per day" for increasing values of x depending on the article's publish date. But if you look at the same article a year or two after it was published, getting to x scale for the cited problem has become trivial with nearly off-the-shelf hardware and software designs.

It seems to me that a computing resource advantage should actually _nearly always diminish_ over time, especially when the value of the resource is widely known and appreciated ...

Scientific research (in many fields) has been bottlenecked on performance and design-test efficiency for a long, long time -- the fact that broader software trends can now support breaking past those bottlenecks on specific problems is not evidence to me that there is now a resource Emperor to whom all others must cower in fear, forever unable to compete head to head in related spaces ... if anything, the fact that more efficient computational approaches are viable _right now_ and known to be valuable means they can spread into academic groups as easily as the domain expertise of academia spread into DeepMind's first foray into this problem space ...

[+] sytelus | 7 years ago
Great insights from someone in the field. AlphaFold's improvement is roughly equivalent to the past two CASP improvements combined. Ruminations on the huge advantage that industrial labs have in top engineering talent and compute resources make the author wonder whether it's worth continuing in an academic lab.

“If I were to pick, I think about half of the performance improvement we see in AlphaFold comes from the simple ideas above, and about half from the sophisticated engineering of the distance-predicting neural network.”

“...with DeepMind’s entry I will have to reconsider, and from conversations with others this appears to be a nearly universal concern. Just like in machine learning, for some of us it will make sense to go into industrial labs, while for others it will mean staying in academia but shifting to entirely new problems or structure-proximal problems that avoid head-on competition with DeepMind.”

“...competitively-compensated research engineers with software and computer science expertise are almost entirely absent from academic labs, despite the critical role they play in industrial research labs. Much of AlphaFold’s success likely stems from the team’s ability to scale up model training to large systems, which in many ways is primarily a software engineering challenge. “

“For DeepMind’s group of ~10 researchers, with primarily (but certainly not exclusively) ML expertise, to so thoroughly route everyone surely demonstrates the structural inefficiency of academic science.”
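
The "distance-predicting neural network" in the quotes above refers to AlphaFold's distogram model: a deep 2D convolutional network that maps pairwise sequence/MSA features to a histogram over inter-residue distances, which then drives structure optimization. A toy Python/PyTorch sketch of that idea (the layer counts, channel sizes, and feature dimensions here are illustrative assumptions, not DeepMind's actual architecture):

    import torch
    import torch.nn as nn

    class DistogramNet(nn.Module):
        # Maps pairwise features (batch, in_channels, L, L) to per-residue-pair
        # logits over distance bins (batch, n_bins, L, L).
        def __init__(self, in_channels=64, n_bins=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(in_channels, 128, kernel_size=3, padding=1), nn.ELU(),
                # A dilated convolution widens the receptive field across the
                # L x L pair grid, in the spirit of deep distogram predictors.
                nn.Conv2d(128, 128, kernel_size=3, padding=2, dilation=2), nn.ELU(),
                nn.Conv2d(128, n_bins, kernel_size=1),
            )

        def forward(self, pair_features):
            return self.net(pair_features)

    # Example: distance-bin logits for a length-100 protein from random features.
    logits = DistogramNet()(torch.randn(1, 64, 100, 100))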

[+] kohanz | 7 years ago
I'm sure there's a good explanation for it, but calling a conference that occurs in 2018 "CASP13" seems like an unnecessary way to create some confusion.
[+] dekhn | 7 years ago
I would say that pretty much any time a team comes in and improves CASP results over the baseline, it's a win. Traditionally, however, it's been too hard for regulars to reproduce the results of the winning team: the method isn't reduced to a GitHub repo you can run to generate new, accurate structures, the way some recent advances in imaging and drug discovery have been.

Papers are nice, but github code that runs is gold.

[+] cs702 | 7 years ago
Another field is being upended by deep learning methods; that's what happened.

A small team from DeepMind with mainly AI/ML expertise won first place at a prominent academic competition, besting teams of experts by a surprising margin.

Academics who have invested a lifetime studying and working on the problem are suddenly wondering if their skills and experience are at risk of becoming less relevant. They're wondering, how could a bunch of neophytes pull this off? Is this just the opening salvo?

A natural reaction will be to dismiss this as "nothing new, just better engineering and clever hacking with more resources." Another reaction will be to dismiss deep learning techniques as "curve-fitting without insight."

Such dismissals are misguided in my view. Judging by how quickly deep learning methods have become the dominant approach for producing state-of-the-art results in other fields, I would expect the same thing to happen in this field.

[+] bluGill | 7 years ago
Computers have ALWAYS been about solving problems. I have long told others going into computer science-type fields that you should get a double major (or at least a minor) in some other field, because you are most useful when you can take your computer knowledge and apply it to a completely different field.

I didn't take my own advice... My career has been about learning other fields in order to apply computers to them, and it has been a very interesting journey. Note that "fields" is plural there; I have not stayed in the same industry. (I think this means something, but I don't know what.)

[+] corporateguy6 | 7 years ago
I personally think that machine learning will have a broad effect on the legitimacy of the pharmaceutical industry and of health care in general. Once real data about all of these drugs, treatments, and expensive procedures finally starts being collected and exposed, the public will see things as they are: a farce.
[+] jcranmer | 7 years ago
If the pharmaceutical industry were willing to fake efficacy results, surely it could do better than having over half of its drugs fail Phase III efficacy trials, especially since one thing everyone agrees on is that the best way to lower drug costs is to figure out much, much earlier which drugs are going to fail Phase III.
[+] viraptor | 7 years ago
> real data about ... collected and exposed

That is not a machine learning issue. Just collecting the information is costly and time-consuming, and it can't be outsourced to servers.

[+] sanxiyn | 7 years ago
For drugs, we already do postmarketing surveillance. Whatever you think of pharmas, drugs are not a farce.
[+] VikingCoder | 7 years ago
"Automation is scary," says the very smart person, "for people who do menial labor. Their jobs will be replaced... I'm so glad I work in a STEM field that requires the kinds of thinking machines will never be able to do!"