whymauri|6 months ago
Our chemists were split: some argued it was an artifact, others dug deep and provided reasoning as to why the generations were sound. Keep in mind, that was a non-reasoning, very early-stage model with simple feedback mechanisms for structure and molecular properties.
In the wet lab, the model turned out to be right. That was five years ago. My point is, the same moment that arrived for our chemists will be arriving soon for theoreticians.
wenc|6 months ago
For instance, you can put a thousand temperature sensors in a room, which give you 1000 temperature readouts. But all these sensors are correlated, and if you project them down to latent space (using PCA or PLS if linear, projection to manifolds if nonlinear), you’ll get maybe 4 new latent variables (usually linear combinations of the original variables) that describe all the sensor readings; it’s a kind of compression. All you have to do then is control those 4 variables, not 1000.
In the chemical space, there are thousands of possible combinations of process conditions and mixtures that produce certain characteristics, but when you project them down to latent variables, there are usually fewer than 10 that give you the properties you want. So if you want to create a new chemical, all you have to do is target those few variables. You want a new product with particular characteristics? Figure out how to get < 10 variables (not 1000s) to their targets, and you have a new product.
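A minimal sketch of that compress-then-target idea using scikit-learn's PCA (the sensor data is simulated and every number here is made up for illustration, not from any real plant):

    import numpy as np
    from sklearn.decomposition import PCA

    # Simulated data: 1000 correlated sensor readings driven by 4 hidden factors.
    rng = np.random.default_rng(0)
    hidden = rng.normal(size=(500, 4))            # 500 samples, 4 true factors
    mixing = rng.normal(size=(4, 1000))           # each sensor mixes the factors
    readings = hidden @ mixing + 0.1 * rng.normal(size=(500, 1000))

    # Compress: 1000 readings -> 4 latent variables.
    pca = PCA(n_components=4)
    latent = pca.fit_transform(readings)
    print(pca.explained_variance_ratio_.sum())    # near 1.0: 4 components suffice

    # Target: choose desired latent values and map them back to sensor space.
    target = np.array([[1.0, -0.5, 0.0, 2.0]])
    setpoints = pca.inverse_transform(target)     # shape (1, 1000)

The same two steps (fit a projection, then work in the low-dimensional space) carry over to the chemical case, just with property data instead of temperatures.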
whymauri|6 months ago
https://www.pnas.org/doi/10.1073/pnas.1611138113
You summarized it very well!
svantana|6 months ago
1. https://en.wikipedia.org/wiki/Evolved_antenna
kmarc|6 months ago
https://www.economist.com/science-and-technology/2025/07/02/...
My understanding is that iterating on possible sequences (of codons, base pairs, etc.) is exactly what LLMs, these feedback-looped predictor machines, are especially good at. The newest models, the ones that "reason about" (check) their own output, are even better at it.
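As a toy version of that loop (nothing here is a real model or lab API; mutate and score are invented stand-ins for a generator and a fitness signal):

    import random

    BASES = "ACGT"

    def mutate(seq):
        # Propose a single-point change (toy stand-in for a model's proposal).
        i = random.randrange(len(seq))
        return seq[:i] + random.choice(BASES) + seq[i + 1:]

    def score(seq):
        # Toy objective: GC content near 50%. Real feedback would come from
        # assays, simulations, or learned property predictors.
        gc = (seq.count("G") + seq.count("C")) / len(seq)
        return -abs(gc - 0.5)

    best = "".join(random.choice(BASES) for _ in range(60))
    for _ in range(1000):
        candidate = mutate(best)
        if score(candidate) > score(best):   # keep only improvements
            best = candidate

Swap the mutation step for model generations and the scoring step for real measurements, and you get the iterate-and-check loop above.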
apimade|6 months ago
Similarly for physicists: there's a very confusing/unconventional antenna called the "evolved antenna" that was used on a NASA spacecraft. Its design came out of genetic programming, and why the antenna's bends at different points increase gain is still not well understood today.
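To make that concrete, here's a minimal genetic-algorithm sketch of that style of search (the bend-angle encoding and the fitness function are invented for illustration; the real project scored candidate designs with electromagnetic simulation):

    import random

    def fitness(angles):
        # Toy objective; a real run would score each design with an
        # electromagnetic simulator, not a closed-form expression.
        return -sum((a - 42.0) ** 2 for a in angles)

    def crossover(a, b):
        cut = random.randrange(1, len(a))
        return a[:cut] + b[cut:]

    def mutate(angles, rate=0.1):
        return [a + random.gauss(0, 5.0) if random.random() < rate else a
                for a in angles]

    # Each candidate antenna is a list of 8 bend angles (illustrative encoding).
    pop = [[random.uniform(0, 90) for _ in range(8)] for _ in range(50)]
    for _ in range(200):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:10]                   # keep the fittest designs
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(40)]
        pop = parents + children
    best = max(pop, key=fitness)

Selection keeps whatever measurably works, so the output can perform well without anyone being able to say why.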
This all boils down to empirical reasoning, which underlies the vast majority of science (and science-adjacent fields like software engineering, the social sciences, etc.).
The question, I guess, is: do LLMs, "AI", and ML give us better hypotheses or tests to run in support of empirical, evidence-based scientific breakthroughs? The answer is yes.
Will they be substantial and meaningful, a significant improvement over today's approaches?
I can’t wait to find out!
pojzon|6 months ago
Wouldn't that mean the fall of US pharmaceutical conglomerates, based on current laws about copyright and AI-generated content?
ACCount37|6 months ago
You never quite know.
Right now, it's mostly the former. I fully expect the latter to become more and more common as the performance of AI systems improves.
lukev|6 months ago
GPT-5 (and other LLMs) are by definition language models, and though they will happily spew tokens about whatever you ask, they don't necessarily have the training data to properly encode the latent space of, e.g., drug interactions.
Confusing these two concepts could be deadly.
Difwif|6 months ago
Improving next-token prediction performance on these datasets, and generalizing, requires a much richer latent space. I think it could theoretically lead to better results through cross-domain connections (e.g., being fluent in a specific area of advanced mathematics, quantum mechanics, and materials engineering is key to a particular breakthrough).