Assessing whether ChatGPT would steer patients to an interventional radiologist
Does the large language model ChatGPT know enough about interventional radiology to steer patients toward the specialty? And what are its potential blind spots?
Researchers at the Icahn School of Medicine at Mount Sinai recently set out to answer these questions, presenting the AI chatbot with a series of prompts about ailments often treated by IRs.
Chloe G. Cross, MD, an IR resident at the New York City institution, and colleagues shared their findings Oct. 18 in the Journal of Vascular and Interventional Radiology [1].
“With the rapid advancement of artificial intelligence in healthcare, interventional radiology must understand how patients and other clinicians may use these technologies. Specifically, AI chatbots may be used by patients for self-diagnosis and by clinicians in supporting clinical practice,” the authors wrote Friday. “With this technology comes challenges including the data itself containing bias. Given this, determining if ChatGPT suggests treatment by IR for certain disease processes can give insight into the public perception of IR by assessing its representation within this dataset,” they added later.
For the research letter, Cross et al. developed a list of diseases commonly treated by IRs. They then created a standard prompt, asking ChatGPT: “I have [disease process]. What types of doctors can treat this?” Each prompt was submitted three times using GPT version 3.5. In all, Cross and co-authors collected 69 outputs, covering 23 disease processes with three repeats each.
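The prompt design described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' code: the function name `build_prompts` is hypothetical, and only the three disease processes named in this article are used, since the full list of 23 is not given here. How the authors worded each disease inside the template is also an assumption.

```python
# Illustrative sketch only; not the study's actual code.
PROMPT_TEMPLATE = "I have {disease}. What types of doctors can treat this?"
REPEATS = 3  # each prompt was submitted three times


def build_prompts(diseases, repeats=REPEATS):
    """Return one prompt per disease per repeat, grouped by disease."""
    return [
        PROMPT_TEMPLATE.format(disease=disease)
        for disease in diseases
        for _ in range(repeats)
    ]


# Only these three disease processes are named in the article;
# the remaining 20 are not listed here, so they are not guessed.
example_diseases = [
    "pulmonary embolism",
    "bone masses",
    "splenic artery aneurysm",
]

prompts = build_prompts(example_diseases)
print(len(prompts))   # 3 diseases x 3 repeats
print(23 * REPEATS)   # the study's total: 23 diseases x 3 repeats
```

With the full list of 23 disease processes, the same function would yield the 69 prompts reported in the letter.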
Nearly 74% of outputs suggested interventional radiology for treatment of the disease. IR was typically the third suggested specialty, with an average rank of 3.3. Further analysis demonstrated “significant” variance across disease processes in the average number of times ChatGPT suggested IR. No outputs included specialties that do not exist or were “grossly inappropriate” for treatment, as verified by a physician with eight years of practice.
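As a rough illustration of how the two headline figures could be computed, the sketch below takes each output as an ordered list of suggested specialties. The function name `summarize` and the toy data are hypothetical and are not the study's results. Averaging the rank only over outputs that mention IR is an assumption, since the article does not say how outputs without IR were handled.

```python
# Illustrative sketch only; the data below is made up for demonstration.
IR = "interventional radiology"


def summarize(outputs):
    """Return (share of outputs suggesting IR, mean 1-based rank of IR).

    Assumption: the mean rank is taken only over outputs that list IR.
    """
    if not outputs:
        raise ValueError("need at least one output")
    ranks = [specialties.index(IR) + 1 for specialties in outputs if IR in specialties]
    share = len(ranks) / len(outputs)
    mean_rank = sum(ranks) / len(ranks) if ranks else None
    return share, mean_rank


# Hypothetical outputs, each an ordered list of suggested specialties.
toy_outputs = [
    ["vascular surgery", IR],
    ["hematology", "pulmonology"],
    [IR, "vascular surgery"],
    ["cardiology", "vascular surgery", IR],
]

share, mean_rank = summarize(toy_outputs)
print(f"IR suggested in {share:.0%} of outputs, mean rank {mean_rank:.1f}")
```

Grouping the outputs by disease process before calling the function would give the per-disease figures whose variance the authors analyzed.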
“In general, ChatGPT acknowledges IR’s role in the treatment of many diseases,” the authors wrote. “Patients and clinicians may use ChatGPT in medical decision-making, which can improve the knowledge of IR as a specialty and increase referrals. The mean number of times IR was suggested varied amongst the disease processes, which may indicate a need to improve recognition in areas like pulmonary embolism and bone masses for which IR was not suggested at all for treatment. Alternatively, in the case of splenic artery aneurysm, IR was suggested first in every output, which may indicate strong representation of IR in this area, though our favorable position may be because the prompt is specific.”