FanaHOVA | 11 months ago

The non-tinfoil-hat approach is to simply Google "Boston demographics" and think about how the training data distribution impacts model performance.

> The data set used to train CheXzero included more men, more people between 40 and 80 years old, and more white patients, which Yang says underscores the need for larger, more diverse data sets.

I'm not a doctor, so I can't tell you how X-rays differ across genders or ethnicities, but these models aren't magic (especially computer vision ones, which are usually much smaller). If there are meaningful differences and a model rarely or never sees those cases in its training data, it will struggle to recognize them at inference.
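
To make that concrete, here's a toy sketch in plain Python. Everything in it is made up for illustration: the two groups, the single numeric "feature", the +2 shift for group B, and the 95/5 training split are all assumptions, not anything measured from CheXzero or real X-rays. It fits a single cutoff on training data dominated by group A, then checks accuracy on each group separately.

```python
import random

def sample(group, n, rng):
    """Draw n (feature, label) pairs. Hypothetical setup: group "B" has the
    same label structure as group "A", but its feature is shifted by +2."""
    shift = 2.0 if group == "B" else 0.0
    data = []
    for _ in range(n):
        label = rng.random() < 0.5
        mean = (1.0 if label else -1.0) + shift
        data.append((rng.gauss(mean, 1.0), label))
    return data

def fit_threshold(train):
    """Pick the cutoff t that minimizes training error for the rule
    'predict positive when x > t'. Brute force over observed values."""
    candidates = sorted(x for x, _ in train)
    return min(candidates, key=lambda t: sum((x > t) != y for x, y in train))

def accuracy(threshold, data):
    """Fraction of examples where the thresholded prediction matches the label."""
    return sum((x > threshold) == y for x, y in data) / len(data)

rng = random.Random(0)

# Training set is 95% group A and 5% group B (assumed imbalance).
train = sample("A", 950, rng) + sample("B", 50, rng)
threshold = fit_threshold(train)

# Evaluate on fresh, equally sized samples from each group.
acc_a = accuracy(threshold, sample("A", 2000, rng))
acc_b = accuracy(threshold, sample("B", 2000, rng))
print(f"threshold={threshold:.2f}  group A acc={acc_a:.2f}  group B acc={acc_b:.2f}")
```

The cutoff ends up where it works for group A, because that's where almost all the training examples are, so group B gets scored with a rule that was never tuned for it. A real vision model is far more complicated than one threshold, but the same pressure applies.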
