macksd | 1 year ago
Because they used data that needs to be tokenized differently, and didn't really tune the models for use on that data. That's not really a limit of LLM tokenization per se.
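To make the tokenization point concrete, here's a quick illustration (my own example, not from the paper) of how a general-purpose BPE tokenizer fragments clinical codes. The exact splits depend on the vocabulary, so treat the printed pieces as indicative:

    # Hypothetical illustration of general-purpose tokenization applied to
    # clinical codes. Requires `pip install tiktoken`.
    import tiktoken

    # Tokenizer used by recent OpenAI chat models.
    enc = tiktoken.get_encoding("cl100k_base")

    # Example ICD-10 and CPT-style codes (my own picks); a general-purpose
    # vocabulary has no reason to keep these as single units.
    for code in ["E11.65", "I25.10", "99213"]:
        tokens = enc.encode(code)
        pieces = [enc.decode([t]) for t in tokens]
        print(f"{code!r} -> {len(tokens)} tokens: {pieces}")

A code that a domain-specific tokenizer would treat as one atomic symbol ends up as several arbitrary subword fragments, which is a training/tooling choice rather than an inherent limit of LLMs.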
>> We did not evaluate strategies known to improve LLM performance, including ... retrieval augmented generation
Which is a shame, because this is exactly the kind of use case RAG is supposed to be good for, and the problems they observed are largely the ones it's supposed to help with.
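For anyone unfamiliar, here's a toy sketch of the RAG pattern they skipped. Everything in it (the guideline snippets, the query, the prompt template) is invented for illustration; a real system would use a proper embedding model and an actual LLM call rather than TF-IDF and a print statement:

    # Minimal RAG sketch: retrieve reference snippets, prepend them to the
    # prompt. Requires scikit-learn.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    # Hypothetical knowledge base of coding-guideline snippets.
    corpus = [
        "E11.65: Type 2 diabetes mellitus with hyperglycemia.",
        "I25.10: Atherosclerotic heart disease of native coronary artery "
        "without angina pectoris.",
        "99213: Established patient office visit, low complexity.",
    ]

    vectorizer = TfidfVectorizer()
    doc_matrix = vectorizer.fit_transform(corpus)

    def retrieve(query: str, k: int = 2) -> list[str]:
        """Return the k corpus snippets most similar to the query."""
        q = vectorizer.transform([query])
        scores = cosine_similarity(q, doc_matrix)[0]
        top = scores.argsort()[::-1][:k]
        return [corpus[i] for i in top]

    query = "patient with type 2 diabetes and elevated blood sugar"
    context = "\n".join(retrieve(query))
    prompt = (
        "Use only the reference material below to pick the correct code.\n\n"
        f"References:\n{context}\n\nCase: {query}\nCode:"
    )
    print(prompt)  # in a real system this prompt would go to the LLM

The point is that the model no longer has to recall thousands of codes from its weights; it only has to pick among the retrieved candidates, which is exactly the failure mode the paper reported.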
Looking at the authors, they all seem to be subject-matter experts in medicine and digital medicine, but their conclusion is the one that favors medical professionals, and they don't seem to have tried very hard to get good deep learning results.
I've had nightmares every time I've seen a doctor in the US, frequently because of things not being coded correctly. So honestly I'd just love to see a rigorous study of how often the human staff is messing it up too.
resource_waste | 1 year ago
You know the physician cartel is going to find some nonsensical reason to be anti-LLM even when LLMs diagnose correctly more often.
While every other industry is finding uses, "medicine can't, it's too hard" is going to be the standard line from an industry that still uses faxes.
barfbagginus | 1 year ago
This leaves a huge opportunity for open-source systems that provide better diagnostics for free, though it will take hustle. Then the open-source users among us, at least, can pioneer free and open AI-augmented health care. That could become a popular option once there's strong evidence that our virtual doctors are simply more competent and understanding.
I am excited about the idea that one day soon I'll have a doctor that listens to me, understands the current science and evidence about my conditions, and does not sideline me because I'm autistic, anxious, and obsessive. And we might get to help build it and make it free for everyone!