epups | 1 year ago

The author is making the point that AlphaFold 3 is not so impressive: it is simply regurgitating its training set, and it's not so good for inference.

I think his central point is fair and interesting. The train/test split is apparently legit, as they used structures released before 2021 for training and the rest for testing. However, there was no real check for duplicates, and the success rate might be inflated by a bunch of "me too", low-hanging-fruit structures that are only slight variations on what we already know.

However, I'm not sure I agree with his skepticism. LLMs suffer from the exact same problem (getting one to write a Snake game in any language is trivial, but it is almost certainly regurgitating), yet they can be useful as well. I mean, if for various reasons people are publishing very similar structures out there, there's certainly value in speeding up or considerably reducing that work.

dekhn | 1 year ago

AF2 and 3 both make accurate predictions of novel folds, and the larger community has confirmed this.

AF3 stands as one of the greatest achievements in machine learning/structural biology we've yet seen.

They do remove duplicates by sequence similarity (filtered PDB).

Please assume the DM folks really do know what they are doing.