AI triage system fails to improve radiologist performance or turnaround times
A commercially available artificial intelligence triage system for stroke care does not appear to improve radiologists’ diagnostic performance or their report turnaround times, according to new research published Wednesday.
Intracranial hemorrhage is a major cause of injury and death across the globe, with accurate and timely detection critical for patient survival. Health systems are increasingly using computer-aided notification tools to assist in the process. However, studies on these products have been limited by their retrospective design and other factors, researchers detailed in the American Journal of Roentgenology [1].
Experts with the University of Alabama at Birmingham sought to address this gap, conducting a prospective evaluation of a product from Aidoc. Testing the AI tool across nearly 10,000 noncontrast head CT scans logged in 2021 produced disappointing results: radiologists’ accuracy for detecting such brain bleeds did not improve with AI, while average report turnaround times for positive studies increased.
“In conclusion, use of a commercial AI triage tool did not improve radiologists’ real-world diagnostic performance for detecting [intracranial hemorrhage] on head [noncontrast CT] examinations or report process times for ICH-positive examinations,” Cody H. Savage, MD, with the UAB Heersink School of Medicine’s Department of Radiology at the time of the study, and co-authors wrote Sept. 4. “Additionally, radiologists alone had greater diagnostic performance than AI alone for ICH detection. Such findings challenge the intended benefits from implementation of [the] AI triage system.”
The prospective, single-center study included all adult patients who underwent noncontrast head CT between May and December of 2021. Savage and colleagues separated the investigation into two parts: phase 1 from May to June, before the AI triage software was deployed, and phase 2 from September to December, afterward. The product works by processing CT exams and notifying radiologists of positive results through a widget with a floating pop-up display. Emergency radiologists and neuroradiologists interpreted the images, with a panel of radiologists reviewing all exams to adjudicate disagreements between the radiologist reports and the AI results, as sketched below.
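To make that adjudication step concrete, here is a minimal, hypothetical Python sketch of flagging exams where the AI output and the radiologist’s report impression disagree. The record layout, field names, and flagging function are illustrative assumptions only, not the study’s actual pipeline or Aidoc’s API.

```python
from dataclasses import dataclass

@dataclass
class Exam:
    """One noncontrast head CT exam (hypothetical record layout)."""
    exam_id: str
    rad_positive: bool  # radiologist's report impression: ICH present?
    ai_positive: bool   # AI triage result: exam flagged as positive?

def flag_for_adjudication(exams):
    """Return exams where the AI result and the report disagree.

    The study's panel reviewed all exams; this sketch models only the
    disagreement-flagging step that would feed such a review.
    """
    return [e for e in exams if e.rad_positive != e.ai_positive]

# Hypothetical usage with made-up exam IDs.
exams = [
    Exam("CT-001", rad_positive=True, ai_positive=True),
    Exam("CT-002", rad_positive=False, ai_positive=True),  # possible AI false positive
    Exam("CT-003", rad_positive=True, ai_positive=False),  # possible AI miss
]
for exam in flag_for_adjudication(exams):
    print(f"{exam.exam_id}: route to panel review")
```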
A total of 9,954 CT scans from 7,371 patients were included in the analysis. Across the two phases, between 19.8% and 21.9% of examinations were positive for intracranial hemorrhage. Radiologists showed no significant difference in accuracy at detecting ICH, logging a rate of 99.5% without artificial intelligence versus 99.2% with it. Specificity was higher for radiologists without the help of AI (99.8%) compared to those who used the software (99.3%). Average report turnaround time for positive exams was 147.1 minutes without AI versus 149.9 minutes when using the program.
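For readers untangling the statistics, rates like these follow from the standard confusion-matrix formulas. Below is a brief Python sketch using made-up counts (the paper’s raw two-by-two tables are not reproduced in this article), showing how accuracy, sensitivity, and specificity of the magnitudes quoted above are computed.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic-performance formulas from a 2x2 confusion matrix."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)  # true-positive rate
    specificity = tn / (tn + fp)  # true-negative rate
    return accuracy, sensitivity, specificity

# Made-up counts for a cohort with roughly 20% ICH prevalence; these are
# illustrative only and are not the study's actual figures.
acc, sens, spec = diagnostic_metrics(tp=986, fp=8, tn=3992, fn=14)
print(f"accuracy={acc:.1%}  sensitivity={sens:.1%}  specificity={spec:.1%}")
# -> accuracy=99.6%  sensitivity=98.6%  specificity=99.8%
```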
The findings conflict with previous retrospective studies that have demonstrated positive outcomes stemming from stroke triage software, the authors noted. One potential reason might be the high performance of the radiologists included in the study, who posted accuracy of 99.5% and sensitivity of 98.6%. Another could be the inclusion of emergency and neuroradiology experts, whereas previous analyses tracked diagnostic improvements among nonradiologist physicians. Savage and co-authors also criticized the use of a widget that appears outside of the radiologist’s worklist and may have slowed turnaround times.
“The lack of improvement in report process times for ICH-positive examinations with AI is counter to a key intended purpose of the AI triage system,” the authors noted. “This finding is important, as the rapid identification of the presence (or lack) of ICH guides early treatment decisions. Such decisions may require diagnosis within a specific time frame in order to initiate certain treatments that improve survival or mitigate disability.”
Aidoc on Wednesday criticized the study’s results, noting that the single-center design “significantly restricts generalizability.” The authors also wrongly assumed that a triage solution is intended to “drastically enhance radiologist performance in the short term,” contended Jerome Avondo, PhD, VP of clinical research and reimbursement at Aidoc.
“However, AI’s primary role is often as a safety net, helping catch edge cases that might be missed during typical workflows,” he told Radiology Business by email. “These critical edge cases, although rare, can lead to lifesaving interventions. Expecting AI to show a dramatic improvement in overall accuracy over a short period may not reflect the intended benefit of the technology, which lies in long-term patient safety.”
Avondo also noted that Aidoc has updated the algorithm since 2021, obtaining FDA clearance for a new version in May 2022.
“While we respect the findings presented in the study and applaud the researchers for their efforts in advancing the understanding of AI in radiology, it’s important to note a few limitations to the methods and also in its design that may impact its conclusions,” Avondo cautioned. “The main value of AI tools like Aidoc’s is to quickly identify potential cases, leading to faster patient care. While detecting missed cases is helpful, the primary goal is to speed up treatment. The paper’s focus on time to report finalization doesn’t accurately reflect clinical practice, as results are often communicated verbally immediately and the report is finalized later,” he added later in the emailed statement.