How Radiology Partners is using large language models to monitor AI deployment

Dave Fornell | February 03, 2026 | Radiology Business | Artificial Intelligence

Large language models are increasingly being used in radiology to extract data from narrative reports to improve workflow, and to help validate, monitor and improve other artificial intelligence tools being used in clinical practice.

Walter Wiggins, MD, PhD, a neuroradiologist and director of clinical AI at Rad Partners, hosted a hands-on session on this topic at the Radiological Society of North America annual meeting and described in an interview with Radiology Business how the technology works.

“We use large language models to extract information about specific findings from radiology reports in order to compare that information to an output from a vision AI tool in order to either conduct pre-deployment monitoring, or pre-deployment validation or post-deployment monitoring, or even to curate data for training an AI model,” Wiggins said.

While the work is technically sophisticated, he emphasized it has clear real-world value for clinical practices deploying AI.

“We feel strongly at [Rad Partners technology division] Mosaic that if you are going to deploy an AI tool in your practice, that you gather some data and test it on the model ahead of time,” Wiggins explained.

Using LLMs to extract findings from reports allows practices to better understand baseline AI performance, including the metrics radiologists care about most. He said radiologists pay closest attention to an AI model's positive predictive value and sensitivity. It is also important to know how often the AI will pick up on pathology when it is present in an image, and that pathology should be reflected in the radiology report. Wiggins said this process also helps identify where models underperform.
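As a rough illustration of the two metrics Wiggins names, the sketch below computes sensitivity and positive predictive value by comparing a vision AI tool's output against findings extracted from reports. It is a minimal example, not Rad Partners' method: the `Case` structure, the function names and the toy counts are all assumptions made for illustration, and the report-derived label is treated as the reference standard.

```python
from dataclasses import dataclass


@dataclass
class Case:
    # Hypothetical record: one exam, one finding of interest.
    report_positive: bool  # finding present per the LLM-extracted report label
    ai_positive: bool      # finding flagged by the vision AI tool


def confusion_counts(cases):
    """Count true/false positives and negatives, using the report as reference."""
    tp = sum(c.report_positive and c.ai_positive for c in cases)
    fp = sum((not c.report_positive) and c.ai_positive for c in cases)
    fn = sum(c.report_positive and (not c.ai_positive) for c in cases)
    tn = sum((not c.report_positive) and (not c.ai_positive) for c in cases)
    return tp, fp, fn, tn


def sensitivity(tp, fn):
    """Share of report-positive cases the AI flagged; None if there are none."""
    return tp / (tp + fn) if (tp + fn) else None


def ppv(tp, fp):
    """Share of AI-flagged cases the report confirms; None if nothing was flagged."""
    return tp / (tp + fp) if (tp + fp) else None


# Toy data, invented for illustration only: 100 exams.
cases = (
    [Case(True, True)] * 8
    + [Case(False, True)] * 2
    + [Case(True, False)] * 2
    + [Case(False, False)] * 88
)

tp, fp, fn, tn = confusion_counts(cases)
print(f"sensitivity={sensitivity(tp, fn):.2f}  ppv={ppv(tp, fp):.2f}")
# prints: sensitivity=0.80  ppv=0.80
```

Because the report itself can be wrong, numbers computed this way are best read as a first pass that points to cases worth a closer look.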

At Rad Partners, these methods are already embedded into daily operations.

“We have a number of tools that we've deployed across the practice and we do integrate this into our clinical training for radiologists, but we also have monitoring running in the background on all the AI tools we have deployed,” Wiggins said.

That monitoring helps the group understand how radiologists interact with AI, whether they are appropriately rejecting incorrect results, and whether the tools are actually improving detection of important findings. Without this data, it is difficult to understand how helpful the AI is.

“If the model's not helping you improve in your detection of important clinical findings, then perhaps it's not something that's providing the value you're expecting in your practice,” he said.

Wiggins also noted that AI can help boost radiologists' confidence and efficiency. It often acts as a second set of eyes, helping to detect or confirm subtle findings when radiologists are uncertain or are deciding whether something is an image artifact.

For practices considering AI adoption, Wiggins offered clear guidance. “The strong recommendation is that you do pre-deployment validation,” he said.

This includes outlining a process for sharing representative data with vendors under proper HIPAA protections and performing human-in-the-loop reviews. That approach helps distinguish true AI errors from human reporting errors, identify impressive cases, and uncover instances where radiologists should be more skeptical of AI outputs.
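One way to picture the human-in-the-loop step is as a triage of cases where the AI output and the report disagree. The sketch below is an assumption-laden illustration, not a description of any vendor's or practice's workflow: the record keys, the queue names and the `discordant_cases` helper are hypothetical. It only sorts disagreements into two review queues; deciding whether a disagreement is an AI error or a reporting error is left to the human reviewer.

```python
from collections import defaultdict


def discordant_cases(records):
    """Group case IDs where the AI output and the report-derived label disagree.

    Each record is a dict with the hypothetical keys 'case_id',
    'report_positive' and 'ai_positive'. Concordant cases are skipped.
    """
    queues = defaultdict(list)
    for r in records:
        if r["ai_positive"] and not r["report_positive"]:
            # Either an AI false positive or a finding missing from the report.
            queues["ai_only"].append(r["case_id"])
        elif r["report_positive"] and not r["ai_positive"]:
            # Either an AI miss or a reporting or extraction error.
            queues["report_only"].append(r["case_id"])
    return dict(queues)


# Toy records, invented for illustration only.
records = [
    {"case_id": "c1", "report_positive": True, "ai_positive": True},
    {"case_id": "c2", "report_positive": False, "ai_positive": True},
    {"case_id": "c3", "report_positive": True, "ai_positive": False},
    {"case_id": "c4", "report_positive": False, "ai_positive": False},
]

review_queues = discordant_cases(records)
print(review_queues)
# prints: {'ai_only': ['c2'], 'report_only': ['c3']}
```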

Dave Fornell

Dave Fornell has covered healthcare for more than 17 years, with a focus on cardiology and radiology. Fornell is a five-time winner of the Jesse H. Neal Award, the most prestigious editorial honor in the field of specialized journalism. The wins included best technical content, best use of social media and best COVID-19 coverage. Fornell was also a three-time Neal finalist for best range of work by a single author. He produces more than 100 editorial videos each year, most of them interviews with key opinion leaders in medicine. He also writes technical articles, covers key trends, conducts video hospital site visits, and is very involved with social media.