GPT-4 can detect radiology report errors at the same rate as members of the specialty

GPT-4 can detect radiology report errors at the same rate as members of the specialty, according to new research published Tuesday.

Such mistakes can occur because of resident-to-attending discrepancies, speech-recognition software inaccuracies, and hefty physician workloads, experts wrote in Radiology. To assess the large language model’s skill at spotting errors, the researchers compiled 200 reports (covering X-ray and cross-sectional CT/MR imaging) from a single institution. They inserted 150 errors from five common categories (including omission, insertion, and spelling) and tasked GPT-4 and six radiologists with spotting them.

The LLM matched radiologists’ performance regardless of their experience, scoring a detection rate of nearly 83%, compared with 89% for senior readers, 80% for attending physicians, and 80% for residents.

“This efficiency in detecting errors may hint at a future where AI can help optimize the workflow within radiology departments, ensuring that reports are both accurate and promptly available, thus enhancing the radiology department’s capacity to deliver timely and reliable diagnostics,” lead author Roman J. Gertz, MD, resident in the Department of Radiology at University Hospital of Cologne, Germany, said in an April 16 announcement from RSNA.

Physicians assessed in the study included two senior radiologists, two attendings and two residents. One of the senior readers outperformed GPT-4, scoring a detection rate of nearly 95%. However, the LLM required less processing time per report than the fastest human radiologist in the study (an average reading time of about 3.5 seconds vs. 25 seconds). GPT-4 also had a lower average correction cost per report than the most cost-efficient radiologist (about $0.03 vs. $0.42).

“Ultimately, our research provides a concrete example of how AI, specifically through applications like GPT-4, can revolutionize healthcare by boosting efficiency, minimizing errors and ensuring broader access to reliable, affordable diagnostic services—fundamental steps toward improving patient care outcomes,” Gertz said in the announcement.

Read more in Radiology, the flagship journal of the Radiological Society of North America.

Marty Stempniak
