Both radiologists and AI struggle to identify 'deepfake' X-rays
Both radiologists and artificial intelligence struggle to distinguish “deepfake” radiographic images from authentic ones, according to new research published in RSNA’s Radiology.
Deepfake images are created or manipulated with the help of AI and designed to look like the real thing. Until recently, the term was rarely used in medical settings; it is more often associated with celebrities who have been the victims of deepfake images depicting them in inappropriate, often lewd scenarios. But now, experts are cautioning that these falsified images could have implications for radiologists, too.
“Our study demonstrates that these deepfake X-rays are realistic enough to deceive radiologists, the most highly trained medical image specialists, even when they were aware that AI-generated images were present,” cautioned lead study author Mickael Tordjman, MD, a post-doctoral fellow at the Icahn School of Medicine at Mount Sinai, New York. “This creates a high-stakes vulnerability for fraudulent litigation if, for example, a fabricated fracture could be indistinguishable from a real one. There is also a significant cybersecurity risk if hackers were to gain access to a hospital’s network and inject synthetic images to manipulate patient diagnoses or cause widespread clinical chaos by undermining the fundamental reliability of the digital medical record.”
To get a better idea of whether deepfake images were realistic enough to deceive radiologists, the team tasked 17 radiologists from 12 centers across six countries with distinguishing AI-generated images from authentic ones. They were presented with two sets of radiographs, each of which contained a blend of AI-generated images (created by either GPT-4o or RoentGen) and real patient exams.
Readers were first told to assess the quality of the images and to provide a diagnosis. Next, they were informed of the presence of AI-generated images and were asked to determine whether specific exams were real or fake.
During the initial image-quality assessment, before they were told that AI-generated images were present, just 41% of the readers identified images that had been created or altered by AI. After the purpose of the study was revealed, the readers achieved a mean accuracy of 75%. Radiologists' experience level did not appear to affect their accuracy, but the group noted that musculoskeletal radiologists performed significantly better than those in other subspecialties.
The team also prompted different large language models to identify AI-generated images, but the LLMs struggled as well. None were able to identify all of the synthetic images, though GPT-4o and GPT-5 both spotted more of them than the human readers did.
The team highlighted several features that were common among AI-generated images, including bilateral symmetry, uniform grain or noise patterns, subtly unnatural soft-tissue textures, and overly smooth bone surfaces. However, they cautioned that these features alone may not be enough to equip radiologists with the skills necessary to differentiate between what is real and what is fake.
“We are potentially only seeing the tip of the iceberg,” Tordjman said. “The logical next step in this evolution is AI-generation of synthetic 3D images, such as CT and MRI. Establishing educational datasets and detection tools now is critical.”
The full study is available here.
