Self-supervised AI ‘reads’ radiology reports to speed algorithm development
A machine learning system has come along that needs no human labeling of data for training, yet matches radiologists at classifying diseases on chest X-rays—including some that the model was not specifically taught to detect.
Biomedical informaticists and computer scientists at Harvard and Stanford developed the technique, and they introduce it in a study published Sept. 15 in Nature Biomedical Engineering [1].
Corresponding author Pranav Rajpurkar, PhD, and colleagues suggest their self-supervised model points toward major savings in time and costs for AI developers, given the resources required to adequately annotate pathologies for clinical AI workflows.
Their self-supervised model bested a fully supervised model at disease detection, they report, and it “generalized to pathologies that were not explicitly annotated for model training, to multiple image-interpretation tasks and to datasets from multiple institutions.”
The team says the model, called CheXzero, effectively skips laborious image labeling by physicians and, instead, applies natural language processing to radiology reports.
Pairing the report text with the images used in algorithm training enables zero-shot approaches to medical image interpretation.
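The article does not reproduce the training procedure, but the pairing idea described here is essentially a CLIP-style contrastive objective: an image encoder and a text encoder are trained so that an X-ray and its own report land close together in a shared embedding space. Below is a minimal sketch of that setup, assuming generic PyTorch encoders; the class names, dimensions and temperature value are illustrative placeholders, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContrastivePairModel(nn.Module):
    """Minimal CLIP-style model: embeds X-rays and report text into a shared space."""
    def __init__(self, image_encoder, text_encoder):
        super().__init__()
        self.image_encoder = image_encoder   # e.g., a CNN/ViT returning one feature vector per image
        self.text_encoder = text_encoder     # e.g., a transformer returning one feature vector per report
        self.logit_scale = nn.Parameter(torch.tensor(2.659))  # learnable log-temperature (placeholder init)

    def forward(self, images, report_tokens):
        img_emb = F.normalize(self.image_encoder(images), dim=-1)
        txt_emb = F.normalize(self.text_encoder(report_tokens), dim=-1)
        # Similarity of every image in the batch to every report in the batch.
        return self.logit_scale.exp() * img_emb @ txt_emb.t()

def contrastive_loss(logits):
    """Symmetric cross-entropy: each image should match its own report, and vice versa."""
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_img = F.cross_entropy(logits, targets)      # image -> report direction
    loss_txt = F.cross_entropy(logits.t(), targets)  # report -> image direction
    return (loss_img + loss_txt) / 2
```

Because the reports themselves supply the supervision, no radiologist-assigned labels enter this loop at any point.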
The results “highlight the potential of deep-learning models to leverage large amounts of unlabeled data for a broad range of medical-image-interpretation tasks,” the authors explain. In this way the burgeoning methodology “may reduce the reliance on labelled datasets and decrease clinical-workflow inefficiencies resulting from large-scale labelling efforts.”
Radiology Reports as a Natural Source of AI Supervision
To develop the method, Rajpurkar and co-researchers leveraged the fact that radiology images are “naturally labelled through corresponding clinical reports,” hypothesizing that these reports can supply a “natural source of supervision.”
They used this insight to train CheXzero on publicly available image data and radiology reports in the MIMIC-CXR database. This houses 377,110 images corresponding to 227,835 radiographic studies.
The team tested the model on chest X-rays from institutions in two countries, the idea being to see how the model performed with similar images but dissimilar nomenclature.
The testing showed CheXzero performed comparably to three benchmark radiologists classifying pathologies such as pleural effusion, edema and collapsed lung.
Further, the new model outperformed previous “label-efficient” approaches on chest X-ray pathology classification, suggesting that “explicit labels are not required to perform well on medical-image-interpretation tasks when corresponding reports are available for training.”
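The zero-shot step works by scoring an image against text prompts rather than against trained class heads. The sketch below, which assumes the encoders from the earlier snippet, shows how prompt-based scoring of a single finding could look; the prompt wording and the hypothetical `tokenize` helper are assumptions for illustration and may differ from the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def zero_shot_probability(model, image, pathology, tokenize):
    """Estimate P(pathology present) by comparing a positive and a negative text prompt."""
    prompts = [f"{pathology}", f"no {pathology}"]  # hypothetical prompt wording
    tokens = tokenize(prompts)                      # placeholder tokenizer for the text encoder
    with torch.no_grad():
        img_emb = F.normalize(model.image_encoder(image.unsqueeze(0)), dim=-1)
        txt_emb = F.normalize(model.text_encoder(tokens), dim=-1)
        sims = (img_emb @ txt_emb.t()).squeeze(0)   # similarity to each prompt
        probs = sims.softmax(dim=-1)
    return probs[0].item()                          # probability assigned to the positive prompt

# Example: score one X-ray for several findings without any labeled training data.
# for finding in ["pleural effusion", "edema", "atelectasis"]:
#     print(finding, zero_shot_probability(model, xray_tensor, finding, tokenize))
```

Adding a new pathology to the classifier then amounts to writing a new pair of prompts, which is what allows the model to generalize to conditions it was never explicitly annotated for.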
Additionally, the team externally validated the self-supervised model on two independent datasets.
Generalizable to a ‘Vast Array of Medical Settings’
In coverage by Harvard Medical School’s news division, Rajpurkar conveys the project’s import in lay language:
“With CheXzero, one can simply feed the model a chest X-ray and corresponding radiology report, and it will learn that the image and the text in the report should be considered as similar. In other words, it learns to match chest X-rays with their accompanying report. The model is able to eventually learn how concepts in the unstructured text correspond to visual patterns in the image.”
First author Ekin Tiu, a Stanford undergraduate and visiting researcher at Harvard, adds that the team used chest radiography as a starting point, but the CheXzero model is generalizable “to a vast array of medical settings where unstructured data is the norm.”
The model’s demonstrated capability, Tiu says, “embodies the promise of bypassing the large-scale labeling bottleneck that has plagued the field of medical machine learning.”
The Harvard news item is here, and the study is available in full for free.
The study includes links to data used in the project, and the code used to train, test and validate CheXzero is here.