praccu|4 years ago
A key challenge: very few labs have enough data.
Something I view as a key insight: a lot of labs are doing absurdly labor-intensive exploratory synthesis without clear hypotheses guiding their work. One of our more useful tasks turned out to be interactively helping scientists refine their experiments before running them.
Another was helping scientists develop hypotheses for _why_ reactions were occurring, because they hadn't been able to build principled models of which properties predicted reaction formation.
Going all the way to synthesis is nice, but there's a lot of lower-hanging fruit in making scientists more effective.
entee|4 years ago
Also shameless plug: I started a company to do just that, anchored to generating custom million-to-billion point datasets and using ML to interpret and design new experiments at scale.
probably_wrong|4 years ago
Data is also getting harder, not easier, to get.
I am working right now on a retrosynthesis project. Our external data provider is raising prices while removing functionality, and no one bats an eye. At the same time, our own data is considered a business secret and is therefore impossible to share.
As someone who does NLP research where the code, data and papers are typically free, this drives me insane.
cinntaile|4 years ago
czbond|4 years ago
hashimotonomora|4 years ago
kortex|4 years ago
You might just decide that a compound "needs" isopropanol/acetone, plus a bit of water, because something vaguely similar you encountered years ago crystallized well. You often start with some educated guesses and refine based on what you see.
But there's often no clear hypothesis, no single physical law the system obeys.
kilroy123|4 years ago
Would love to chat more with you about this.
malux85|4 years ago
formerly_proven|4 years ago
This lets you stumble over unknown unknowns. Taylor et al. discovered high-speed steel by ignoring the common wisdom and doing a huge number of trials, arriving at a material and treatment protocol that improved on the then-state-of-the-art tool steels by an order of magnitude or more. The treatment mechanism was only understood 50 to 60 years later.