If properly done, that approach would most likely generate a model that works really well and detects COVID for the right reasons. Put it out in the wild and it breaks down again, because all radiologists would have to use the exact same method of scanning patients. Practice shows that scan parameters, positioning and annotations differ vastly across hospitals, scanners and radiologists. A client of mine wanted me to look into this dataset and help create a model for detecting COVID. One of the doctors they were working with wanted to be able to submit pictures of the CT scan taken with their point-and-shoot camera. Good luck putting that variation in your dataset.
new299|4 years ago
If all other parameters are randomized between case and control, this is also fine. I'd guess you can also add illumination and other artifacts to the dataset to make the training robust to this.
But ultimately in the extreme case (like wanting to submit images from a point and shoot camera) I suspect you'll have a hard time building a system that's robust to that... or robust enough to be used in a diagnostic context...
I personally don't think I'd be comfortable working without a standard configuration used by the radiologists and a controlled protocol.
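To make the augmentation idea above concrete, here is a rough sketch of what "adding illumination and other artifacts" could look like. Everything in it is an assumption for illustration: the function name `augment_photo_artifacts`, the parameter ranges, and the choice of artifacts (a brightness ramp, a gamma shift, Gaussian noise) are made up, and images are plain nested lists of grayscale values in [0, 1] just to keep it dependency-free. A real pipeline would use an array library and tuned ranges.

```python
import math
import random


def augment_photo_artifacts(image, rng):
    """Return a copy of a grayscale image (rows of floats in [0, 1])
    with random photo-like artifacts applied. Hypothetical helper."""
    h, w = len(image), len(image[0])

    # Draw one set of artifact parameters per image (ranges are guesses).
    angle = rng.uniform(0.0, 2.0 * math.pi)   # direction of the brightness ramp
    strength = rng.uniform(0.0, 0.4)          # how uneven the lighting is
    gamma = rng.uniform(0.7, 1.4)             # camera response / exposure shift
    sigma = rng.uniform(0.0, 0.05)            # sensor noise level

    out = []
    for y, row in enumerate(image):
        new_row = []
        for x, value in enumerate(row):
            # Uneven illumination: linear ramp across the image, centred on 0.
            ramp = (math.cos(angle) * (x / max(w - 1, 1) - 0.5)
                    + math.sin(angle) * (y / max(h - 1, 1) - 0.5))
            v = value * (1.0 + strength * ramp)
            # Gamma shift on the clipped value, then additive noise.
            v = min(max(v, 0.0), 1.0) ** gamma
            v += rng.gauss(0.0, sigma)
            new_row.append(min(max(v, 0.0), 1.0))
        out.append(new_row)
    return out


# Usage: the same random augmentation is applied regardless of label,
# so the artifacts stay uncorrelated with case vs. control.
rng = random.Random(42)
scan = [[0.5] * 8 for _ in range(6)]  # stand-in for a real CT slice
augmented = augment_photo_artifacts(scan, rng)
```

The important part is not the specific artifacts but that they are drawn from the same distribution for cases and controls. That is what keeps the model from using them as a shortcut, and it does nothing for variation you never simulated.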