AI triages pneumothorax patients with differentiated diagnoses

A commercially available AI package has proven adept at distinguishing between two similar-looking but unequally urgent conditions on chest X-rays.

In a retrospective assessment conducted by the Mass General Brigham Data Science Office, the product reliably distinguished life-threatening tension pneumothorax from simple pneumothorax and from the absence of pneumothorax in close to 1,000 patients treated at four U.S. hospitals.

The AI package, the FDA-cleared Annalise Enterprise CXR Triage Pneumothorax, flagged simple pneumothorax with 94% sensitivity and 92% specificity while catching tension pneumothorax with 94.5% sensitivity and 95.3% specificity.
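For readers unfamiliar with these metrics, sensitivity and specificity follow directly from confusion-matrix counts. The sketch below uses illustrative numbers, not the study's actual patient data, to show the standard calculation.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Compute standard screening metrics from confusion-matrix counts.

    tp/fn: true cases flagged / missed; tn/fp: negatives cleared / falsely flagged.
    """
    sensitivity = tp / (tp + fn)  # true-positive rate: share of real cases caught
    specificity = tn / (tn + fp)  # true-negative rate: share of negatives correctly cleared
    return sensitivity, specificity

# Illustrative counts only (hypothetical, not from the study):
# 94 of 100 true cases flagged, 92 of 100 negatives correctly cleared.
sens, spec = sensitivity_specificity(tp=94, fn=6, tn=92, fp=8)
print(f"sensitivity={sens:.0%}, specificity={spec:.0%}")  # sensitivity=94%, specificity=92%
```

A triage device like this one trades the two off: tuning the model to flag more borderline studies raises sensitivity at the cost of specificity, which is exactly the behavior the authors describe below for cases with ancillary findings.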

Ground truth diagnoses were provided by consensus reads from radiologists subspecialized in thoracic studies.

James Hillis, MBBS, DPhil, Keith Dreyer, DO, PhD, and colleagues describe the project in a report published Dec. 15 in JAMA Network Open.

The authors report the model performed with high accuracy in finding and classifying pneumothorax types across demographic and technical subgroups, the latter covering numerous radiography equipment manufacturers, patient positions and X-ray projections.

“This performance indicates that the model has generalizability across these subgroups, which hopefully will translate into robust performance as the model encounters further clinical [triage] scenarios moving forward,” Hillis and co-authors comment.

The AI stumbled only when it encountered certain ancillary findings commonly associated with pneumothorax, such as rib fracture, air in the mediastinum, subcutaneous emphysema and evidence of thoracic surgery.

In some of these scenarios the model failed to achieve specificity of at least 80% for ruling out pneumothorax.

The authors surmise the model treated these ancillary findings as evidence of pneumothorax itself, “tricking” it into labeling the X-rays as positive for the condition even when the true diagnosis was a resolved pneumothorax.

“From these perspectives, the model is similar to a human reader who might use these ancillary findings to increase sensitivity with a cost to specificity,” Hillis and colleagues remark.

The Annalise-AI model used a deep convolutional neural network and was trained on more than 750,000 chest X-rays, the authors note.

They disclose that the vendor was involved in the study’s design, manuscript preparation and funding.

The model’s performance exceeded FDA benchmarks for computer-assisted triage devices, they point out.

Addressing limitations in study design, Hillis et al. write:

“While it demonstrates the accuracy of the AI model in interpreting imaging across many demographic and technical subgroups, [the model] does not do so within the broader clinical environment. Further evaluation will be required to know how the model might impact the clinical workflow, including its impact on radiologists for case prioritization and patient outcomes. Further evaluation will also be required as the model encounters clinical scenarios beyond the current study, including from new radiographic equipment manufacturers or models.”

The study is available in full for free.

Dave Pearson

Dave P. has worked in journalism, marketing and public relations for more than 30 years, frequently concentrating on hospitals, healthcare technology and Catholic communications. He has also specialized in fundraising communications, ghostwriting for CEOs of local, national and global charities, nonprofits and foundations.
