govideo | 7 months ago

I'd love to hear more of your thoughts re open questions in biomedical ML. You sound like you have a crisp, nuanced grasp of the landscape, which is rare. That would be very helpful to me, as an undergrad in CS (with bio) trying to crystallize research to pursue in bio/ML/GenAI.

Thank you.

panabee|7 months ago

Thanks, but no one truly understands biomedicine, let alone biomedical ML.

Feynman's quote -- "A scientist is never certain" -- is apt for biomedical ML.

Context: imagine the human body as the most devilish operating system ever: 10b+ lines of code (more than merely genomics), tight coupling everywhere, zero comments. Oh, and one faulty line may cause death.

Are you more interested in data, ML, or biology (e.g., predicting cancerous mutations or drug toxicology)?

Biomedical data underlies everything and may be the easiest starting point because it's so bad/limited.

We had to pay Stanford doctors to annotate QA questions because existing datasets were so unreliable. (MCQ dataset partially released, full release coming.)

For ML, MedGemma from Google DeepMind is open and at the frontier.

Biology mostly requires publishing, but still there are ways to help.

After sharing preferences, I can offer a more targeted path.

govideo|7 months ago

ML first, then Bio and Data. Of course, interconnectedness runs high (eg I just read about ML for non-random missingness in med records), and data is the foundational bottleneck/need across the board.

Interesting anecdote abt Stanford doctors annotating QA questions!

Each of your comments gets my mind going... I'm going to think about them more and may ping you on other channels, per your profile. Thanks!