Generative AI chest X-ray models offer new approach to radiology reporting and quality improvement
Generative artificial intelligence (AI) is poised to reshape chest X-ray interpretation by moving beyond traditional algorithms that detect a single abnormality to models capable of generating an entire radiology report from medical images. While the technology remains investigational in many practices, researchers say it has the potential to improve diagnostic accuracy, streamline AI development and strengthen radiology quality assurance.
At the Radiological Society of North America (RSNA) 2025 annual meeting, Robert Harris, PhD, a machine learning engineer at Virtual Radiologic (vRad), spoke with Radiology Business about generative AI in a video interview. He outlined his team's work developing and evaluating a generative chest X-ray machine learning model for use within one of the nation's largest teleradiology practices.
"These types of generative models have seen an uptick in their use since the release of ChatGPT," Harris said. "Instead of targeting single pathologies like many traditional AI models, they generate an entire radiology report from the input images."
Unlike earlier computer vision algorithms that were trained to identify individual findings such as pneumonia or lung nodules, generative models use radiologists' dictated reports as the ground truth during training. That allows developers to leverage existing clinical reports rather than manually labeling thousands of images or creating detailed image segmentations.
Because every finalized radiology report already contains descriptions of clinically relevant findings, Harris said the approach enables researchers to train models on much larger datasets.
"It ends up having higher accuracy for a lot of things because you have a lot more data," he explained.
Why chest X-ray is the starting point for generative AI
Chest X-rays have emerged as one of the first imaging applications for generative AI because they present a more manageable technical challenge than advanced imaging modalities such as CT. Harris noted that chest X-ray studies contain roughly 100 times less data than CT exams, significantly reducing the computational resources required to train large transformer-based models.
While traditional radiology AI relies heavily on convolutional neural networks optimized for detecting specific abnormalities, most generative reporting systems are based on transformer architectures similar to those powering today's large language models. Training these systems, however, still requires extensive preprocessing of radiology reports. Developers must remove inconsistencies, such as report addendums or corrected findings, that could otherwise confuse the model and teach it to generate inaccurate or contradictory reports.
"There are nuances and a lot of preprocessing that needs to happen, plus careful manipulation of the data," Harris said.
He also noted that chest X-ray interpretation itself is often less definitive than CT imaging, with radiologists frequently using differential diagnoses and hedged language. That variability actually lowers the performance threshold AI must achieve to match human readers.
Real-world evaluation of generative AI underway in radiology
Although generative reporting technology has attracted considerable attention, Harris emphasized that vRad has not yet deployed its chest X-ray model into routine clinical practice. Instead, the company is conducting an investigational study to determine how the technology performs in real-world workflows and identify the most appropriate clinical applications.
"We're collecting data right now and determining what it can be used for, but our results are not in yet," he said.
Rather than purchasing commercially available algorithms, vRad has spent several years building many of its own AI tools internally. Harris said the company's decision was driven by its large-scale quality assurance (QA) program, which continually analyzes missed diagnoses and ranks them according to clinical severity.
Because vRad interprets approximately 7 million imaging studies annually, the practice encounters uncommon, but clinically significant, abnormalities often enough to justify developing AI models that many commercial vendors do not offer.
One example is superior mesenteric artery (SMA) occlusion, a rare diagnosis that can be catastrophic if overlooked.
"We've identified some pathologies that are important to catch that many groups are not building algorithms for," Harris said.
These algorithms have already demonstrated success in identifying findings that otherwise may have been missed during routine interpretation, he added.
AI as a second set of eyes
According to Harris, many radiology misses involve incidental findings that fall outside the clinical question being evaluated. For example, radiologists interpreting a CT neck exam focus on the neck structures that were the reason for the study, rather than on abnormalities partially visible in the upper chest or aorta at the edge of the frame. Similarly, pulmonary emboli may occasionally appear at the edge of abdominal CT scans obtained for unrelated indications.
AI systems, however, evaluate the entire image without being influenced by the ordering indication or human expectations.
"Being a machine rather than human, it's not prone to assumptions about what it should be looking for," Harris said. "Instead, it looks at the entire field of view."
That capability makes AI particularly valuable as a quality assurance tool that can identify subtle incidental abnormalities that might otherwise escape attention.
While the clinical role of generative reporting models remains under investigation, Harris said advances in computing power and rapidly evolving open-source and proprietary AI technologies continue to expand what is possible.
"We're excited to see where this takes us," he said. "This is a continually changing field. What's possible changes every year as new technology is deployed."