A couple of weeks ago, I mentioned I had some concerns about the ChestXray14 dataset. I said I would come back when I had more info, and since then I have been digging into the data. I’ve talked with Dr Summers via email a few times as well. Unfortunately, this exploration has only increased my concerns about the dataset.
WARNING: there are going to be lots of images today. If data and bandwidth are a problem for you, beware. Also, this piece is about 5000 words long. Sorry :)
DISCLAIMER: Since some people are interpreting this wrongly, I do not think this piece in any way reflects broader problems in medical deep learning, or suggests that claims of human performance are impossible. I’ve made a claim like that myself, in recent work. These results are specific to this dataset, and represent challenges we face with medical data. Challenges…
View original post 5,094 more words