Radiologist outperforms AI solution at detecting fractures on X-rays

A radiologist was able to outperform one artificial intelligence solution at detecting fractures on X-rays, according to a new study published in Diagnostic and Interventional Imaging [1]. 

Standard radiography remains the first-line modality in managing patients with bone and joint trauma in the emergency department (ED). However, the rising number of imaging requests in the ED has made it more difficult for physicians to “systematically” interpret these scans, sometimes forcing them to rely on help from junior radiologists or other specialties. One 2019 study reported a diagnostic error rate of up to 24% in EDs.

French imaging experts sought to test the use of an AI solution (Rayvolve) in improving these numbers. The deep learning-based fracture detection algorithm achieved a sensitivity of 82% and specificity of 69%, both significantly lower than those logged by a radiologist (92% and 88%, respectively). 

“Our results reinforce the assumption that AI must be monitored,” Maxime Pastor, with the Department of Medical Imaging at Nîmes University Hospital in France, and co-authors concluded. “The presence of a substantial number of false negative and false positive findings with AI demonstrates that the radiologist still has an essential role in reading radiographs for the detection of pelvic, hip and extremity fractures in adults. Finally, this study underscores the need for a strong standard of reference for the evaluation of AI solutions used for fracture detection.”

The retrospective study included 94 adult patients with suspected bone fractures who underwent a standard-dose CT exam and radiographs of the pelvis, hip and/or extremities at the hospital between 2022 and 2023. AI was used to detect and localize bone fractures, with the results compared to reads completed by a single radiologist. Pastor and co-authors used CT exams interpreted by a senior radiologist as the reference standard.

Of the study sample, 47 patients had at least one fracture, and the reference standard identified 71 fractures in total: 25 of the hand or wrist, 16 of the pelvis and 30 of the foot or ankle. Measured against that reference standard, the AI analysis produced 58 true positives, 13 false negatives, 33 true negatives and 15 false positives. The radiologist’s reads produced 65 true positives, 6 false negatives, 42 true negatives and 6 false positives. The physician outperformed AI in terms of sensitivity (P = 0.045), specificity (P = 0.016) and accuracy (P < 0.001), the authors noted.
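For readers who want to check the arithmetic, the reported sensitivity and specificity figures follow directly from the counts above. The short Python sketch below is only an illustration: the `ReadCounts` class and its field names are hypothetical, the counts are taken as quoted in this article, and the accuracy values it prints are derived here rather than quoted from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadCounts:
    """Confusion-matrix counts for one reader (hypothetical helper, not from the study)."""
    tp: int  # true positives
    fn: int  # false negatives
    tn: int  # true negatives
    fp: int  # false positives

    @property
    def sensitivity(self) -> float:
        # Share of reference-standard fractures that the reader detected.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Share of reference-standard negatives that the reader called negative.
        return self.tn / (self.tn + self.fp)

    @property
    def accuracy(self) -> float:
        # Share of all calls that agreed with the reference standard.
        return (self.tp + self.tn) / (self.tp + self.fn + self.tn + self.fp)


# Counts as quoted in the article.
ai = ReadCounts(tp=58, fn=13, tn=33, fp=15)
radiologist = ReadCounts(tp=65, fn=6, tn=42, fp=6)

for name, counts in (("AI", ai), ("Radiologist", radiologist)):
    print(
        f"{name}: sensitivity {counts.sensitivity:.1%}, "
        f"specificity {counts.specificity:.1%}, "
        f"accuracy {counts.accuracy:.1%}"
    )
```

Rounded to whole percentages, the sensitivity and specificity values match the 82% and 69% reported for AI and the 92% and 88% reported for the radiologist.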

The study was limited by its single-center design and small number of patients, among other factors. 

“By choosing CT as the standard of reference, we obtained a strong standard of reference for the presence or absence of fracture,” the authors noted. “This may explain the worse results of AI on radiographs in our study compared to prior publications using radiographs interpreted by specialist radiologists as the gold standard, with 91% sensitivity in a meta-analysis. This raises the question of whether radiographs should be replaced by ultra-low-dose CT, since, at an equivalent dose, better information and higher confidence in diagnosis can be obtained, especially for the extremities. Furthermore, considering the lower performance of the AI solution when CT is used as the standard of reference, this should warrant further studies with this strong standard of reference in the future for the evaluation of AI solutions used for fracture detection.”

Marty Stempniak

Marty Stempniak has covered healthcare since 2012, with his byline appearing in the American Hospital Association's member magazine, Modern Healthcare and McKnight's. Prior to that, he wrote about village government and local business for his hometown newspaper in Oak Park, Illinois. He won Peter Lisagor and Gold EXCEL awards in 2017 for his coverage of the opioid epidemic.
