Shinkei | 9 years ago

Really happy you wrote this anecdote, because it illustrates a point physicians are arguing about a lot right now: how to measure quality of care. Is it readmission rates? 30-day mortality? Getting all the required screening exams for each patient? We don't know. In the US, CMS is trying to mandate many of these metrics, and quality of care turns out to be really hard to measure.

For example, I am a radiologist. Probably 90% of my cases are mundane: either normal or with 'easy' pathology that I can readily detect and quickly report. Another 9% is a mixed bag of things that take a lot more time--something I need to reference, think a bit more about, or possibly show a colleague. And then there's the 1% that is a true 'make or break' case. My sub-specialty training and experience really shine in these situations: I can confidently dispense a diagnosis or make a tough call where another reader might equivocate or defer to more imaging or a follow-up. I like to think I'm not being paid for that 90%... I'm being paid for the other 9+1%. That's where I truly add value in the system. But how would I measure my 'success' rate when even the best eventually make mistakes given a long enough time, and there is often no gold standard for diagnosis--or even long-term follow-up/resolution--for many of the tough cases I've seen?

Studies have been done where experts review a sampling of cases, and that's essentially what we do now for quality control: we randomly review a few prior exams for a case we are reading and submit feedback based on our own read of the same case. In these studies, trained radiologists do very well and make clinically significant mistakes only rarely.
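
A quick back-of-the-envelope illustration of why that rare error rate is hard to pin down: if clinically significant mistakes are rare, a modest random audit will often find none at all. The error rate and audit sizes below are assumptions for illustration, not figures from any study.

    # Assumed: 1 clinically significant miss per 200 reads (0.5%).
    # How often does a random audit of N cases turn up zero errors?
    true_error_rate = 0.005
    for audited_cases in (50, 200, 1000):
        p_zero_found = (1 - true_error_rate) ** audited_cases
        print(f"audit {audited_cases:>4} cases: "
              f"P(find zero errors) = {p_zero_found:.2f}")

An audit of 200 cases that finds nothing is still perfectly consistent with a 1-in-200 miss rate, which is why sampling-based QC says so little about the rare 'make or break' cases.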

But here's the problem. Let's say we use machine learning to interpret a scan--something like a CT of the chest to look for a pulmonary embolus. On that scan, the machine may see an incidental pulmonary nodule. That's fine; we have good data on how to follow up on those and what to recommend. But what about an anterior mediastinal lesion? Now it's not as clear. The differential is wide and depends on age, sex, symptoms, history, etc.

Let's say we build that logic tree and let the machine learn from 1,000 anterior mediastinal mass cases with tissue diagnosis. Well guess what: we do those studies all the time, and it turns out that NOTHING is 100% sensitive AND 100% specific. You have to sacrifice one for the other... it's a balance. So you would need to build ROC curves for every possible finding on every scan and decide that 1% of the time you will miss a cancer to save having to biopsy an extra 20 people... or maybe you want to miss cancer only 0.1% of the time, but that means you'll have to biopsy an extra 200 people, and one will be hospitalized from complications. You think patients will be happy knowing you made that decision?

Guess what: we are already doing this with mammography. The screening guidelines that the different societies and agencies argue about come down to this very rationale--how many women should be allowed to die to prevent those extra biopsies, false-positive workups, and everything that comes with them. I welcome machine learning into medicine. It's destined to be mired in the same ethical dilemmas we face every day.
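
To make that operating-point tradeoff concrete, here's a minimal Python sketch. Everything in it is synthetic--made-up score distributions and an assumed 1% malignancy prevalence--so the counts are illustrative only, not from any real model:

    # Synthetic scores standing in for a hypothetical nodule classifier.
    # Malignant lesions are assumed to score higher on average than benign.
    import numpy as np

    rng = np.random.default_rng(0)
    benign = rng.normal(0.0, 1.0, 99_000)     # assumed 1% prevalence
    malignant = rng.normal(3.0, 1.0, 1_000)

    for target_miss_rate in (0.01, 0.001):
        # Threshold chosen so this fraction of cancers falls below it.
        threshold = np.quantile(malignant, target_miss_rate)
        missed = int(np.sum(malignant < threshold))
        extra_biopsies = int(np.sum(benign >= threshold))
        print(f"miss {target_miss_rate:.1%} of cancers: "
              f"{missed} missed, {extra_biopsies} benign biopsies")

The exact counts depend entirely on how well the scores separate the two groups, but the shape of the tradeoff is the point: tightening the miss rate tenfold can multiply the benign biopsies, and someone has to choose where on that curve to operate.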

I have a feeling I'll make it to retirement.

coryrc | 9 years ago

Yes, much better to make ad-hoc decisions about additional tests versus missed disease, without the benefit of a double-checked system giving you and the patient the likelihood of complications.

Shinkei | 9 years ago

I appreciate the sarcastic comment, but I don't think you understand the implication of a system like that. Having more data doesn't necessarily better inform a patient's decision process.

For example, let's say you could--you can't, but let's say you could--predict with 98% certainty that the pulmonary nodule in your lung is not a cancer. Well, if you are 40, that's actually not that good... it means 2 out of 100 people in a very productive time of their lives may have a cancer go completely ignored! So should I tell every patient in my report, "There is a 2% chance that this is malignant, but I won't recommend biopsy because biopsy carries its own risk of complications, and letting a few slip through the cracks saves the healthcare system a lot of money. Thanks for your understanding."
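
To spell out the arithmetic (every figure below is hypothetical, chosen only to illustrate the population-level framing, not a clinical rate):

    # Hypothetical numbers only--population-level arithmetic, not clinical data.
    residual_malignancy_prob = 0.02   # "98% certain it's benign"
    biopsy_complication_prob = 0.01   # assumed serious-complication rate

    cohort = 10_000  # patients with this exact finding

    missed_cancers = cohort * residual_malignancy_prob   # never biopsy
    complications = cohort * biopsy_complication_prob    # biopsy everyone

    print(f"No-biopsy policy:  ~{missed_cancers:.0f} cancers initially missed")
    print(f"Biopsy-all policy: ~{complications:.0f} serious complications")

Neither policy gets to zero harm--that's the tradeoff hiding behind the percentage.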

Remember, statistics predict population outcomes... not individual outcomes. I can tell someone that something very rare might happen... but guess what, when it does happen, the fact that it was a rare possibility doesn't assuage any of the negative feelings about it.

There is no right answer. Some people are illiterate! Even educated people don't understand statistics... so how am I going to quantify that kind of risk/benefit analysis in a way that ensures a patient truly understands the implications? What if that risk were 1%, or what if the patient were 70 years old? Should either of those affect my recommendation? Who am I to decide who should be recommended one thing versus another... it's a value judgement! But if I leave it solely to the patient, a lot of the time they will ask me, "What would you do?" That's probably the most common question after a long discussion like that. The answer is, "I don't know."

Bottom line: We can make 'strong' recommendations for things that are well studied, like breast cancer and pulmonary nodules, but we don't have data to support recommendations in many other areas of everyday practice. A machine learning system would need data that just isn't available yet to make those recommendations.