When training AI to classify chest x-rays, is more data always better?
Convolutional neural networks (CNNs) trained with 20,000 labeled images can accurately classify chest x-rays as normal or abnormal, according to new findings published in Radiology. Training the CNN with an additional 180,000 images, the authors noted, yielded only “marginal” benefits.
“This work could be clinically important both by permitting radiologists to spend more time on abnormal studies and by demonstrating a simple mechanism to combine physician judgment with deep learning algorithms such as CNNs in a manner that can improve interpretation performance,” wrote lead author Jared A. Dunnmon, from the department of computer science at Stanford University in Stanford, California, and colleagues.
The authors trained a CNN with 20,000 labeled chest x-rays acquired between 1998 and 2012, reporting an average area under the receiver operating characteristic curve (AUC) of 0.95. To show the impact of training-set size on performance, the team also assessed the AUC of models trained with 200,000 images and with 2,000 images. The model trained with 200,000 images had an average AUC of 0.95, while the model trained with 2,000 images had an average AUC of 0.84.
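The shape of that comparison is straightforward to reproduce. The short Python sketch below is illustrative only, not the authors’ code: it trains the same model on subsets of 2,000, 20,000 and 200,000 examples and scores each on a fixed held-out test set with ROC AUC. A logistic regression on synthetic data stands in for the CNN and the x-ray images, purely to show the evaluation logic.

```python
# Illustrative sketch (not the study's code): measure how test AUC
# changes as the training set grows, holding the test set fixed.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for labeled normal/abnormal studies.
X, y = make_classification(n_samples=250_000, n_features=50,
                           n_informative=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=50_000, random_state=0)

rng = np.random.default_rng(0)
for n in (2_000, 20_000, 200_000):
    # Draw a training subset of size n, fit, then score the same test set.
    idx = rng.choice(len(X_train), size=n, replace=False)
    model = LogisticRegression(max_iter=1000).fit(X_train[idx], y_train[idx])
    scores = model.predict_proba(X_test)[:, 1]  # predicted P(abnormal)
    print(f"{n:>7} training examples -> test AUC "
          f"{roc_auc_score(y_test, scores):.3f}")
```

In a pattern like the one the study reports, the AUC gains from 2,000 to 20,000 examples are large, while the gains from 20,000 to 200,000 are much smaller.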
These results, Dunnmon and colleagues explained, show that providers without access to hundreds of thousands of labeled images can still develop high-performing CNNs.
“The results of our study should be validated in other patient populations; however, our findings suggest a distinct value to combining deep-learning techniques, such as CNNs, with data sets of sizes already accessible to many institutions to improve thoracic imaging triage,” the authors wrote.
The team also noted that their findings were consistent with prior research by Gulshan et al. published in JAMA, which found that a CNN’s performance on retinal images “reached a plateau after approximately 60,000 images.”