The promise and potholes of computer vision as viewed from a radiological vantage point
Radiologists are known to miss comical oddities planted in medical images. In one famous study from 2013, 83% of radiologists missed a gorilla waving hello from inside a lung scan.
The fake ape was 48 times the size of the nodules the physicians had been asked to inspect. Worse, eye-tracking software showed that most of those who missed it had rested their gaze on the very spot the creature occupied.
A writer at Harvard Medicine magazine revisits the anecdote not to re-shame radiology but to explain the concepts and principles of computer vision for a general audience.
To do so, she first names the problem that justifies the technology’s existence: “inattentional blindness,” a condition ubiquitous among humans. And where better to explore its potential costs than in radiology?
“Studies suggest that error rates in radiology, which hover around 4% for all images, and up to 30% among abnormal images, have remained relatively unchanged since the 1940s,” notes the writer, Molly McDonough, citing a 2016 study. “Medical imaging is therefore among the fields in which computer vision is touted as a way to decrease error rates by picking up on the clues that humans miss—or even someday doing the job solo.”
When computer vision met generative AI
Broadly defined, computer vision is a branch of AI focused on still and moving images. As generative AI has exploded in use over the past year and a half, it’s melded with computer vision to let algorithms create new images and videos from text, images or both.
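The article keeps things nontechnical, but the text-to-image capability it describes is easy to sketch. Here’s a minimal illustration using the open-source Hugging Face diffusers library; the checkpoint, prompt and file name are our own arbitrary choices, not anything drawn from the article:

```python
import torch
from diffusers import StableDiffusionPipeline

# Load a pretrained text-to-image pipeline (the checkpoint here is an
# illustrative choice, not one named in the article).
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # assumes a CUDA-capable GPU is available

# Generate a new image from plain text (the prompt is our own joke).
image = pipe("a gorilla waving hello from inside a lung scan").images[0]
image.save("gorilla_scan.png")
```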
One snag in the technology’s development as an image interpreter: AI can have blind spots just like people. But the problem isn’t traceable to fatigue, eye strain, repetitiveness or other contributors to inattentional blindness.
Rather, the challenge of getting algorithms to nail optical targets flawlessly lies in the extreme complexity of training a machine to see the way humans see.
The degree of difficulty here is easy to appreciate when one considers that the retina is part of the brain and, by extension, the central nervous system.
Brainlike behavior aped but not bested
Like the layers of neurons in our brains, McDonough writes, computational units in visual deep learning systems are “responsible for processing small, overlapping regions of an image, detecting features like shapes and textures.”
These so-called convolutional layers, she notes, are “interspersed with layers that generalize information so the computer can combine simple features into increasingly complex ones, as well as with layers that process the information to complete a given task.”
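McDonough’s description maps neatly onto working code. Below is a minimal sketch of such a network, assuming PyTorch, a single-channel 64x64 input and a made-up 10-class task (none of these specifics come from the article):

```python
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    """A minimal network mirroring the layer types described above."""

    def __init__(self, num_classes: int = 10):  # 10 classes is an arbitrary choice
        super().__init__()
        self.features = nn.Sequential(
            # Convolutional layers scan small, overlapping regions of the
            # image, detecting simple features like edges and textures.
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            # Pooling layers "generalize" by downsampling, so later layers
            # can combine simple features into increasingly complex ones.
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # A fully connected layer processes the pooled features to
        # complete the given task, here producing per-class scores.
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, num_classes),  # 64x64 pooled twice -> 16x16
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

# One fake grayscale 64x64 "scan" in, ten class scores out.
scores = TinyCNN()(torch.randn(1, 1, 64, 64))
print(scores.shape)  # torch.Size([1, 10])
```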
As elucidated for the article by Harvard psychologist Talia Konkle, PhD, “The level of complexity of the biological system so far overshadows these models—it’s not even funny.”
“But as information processing systems,” Konkle adds, “there are deep principles about how [computer vision models] solve the problem of handling the rich structure of visual input that really seem to be like how the human visual system is solving the problem, too.”
There’s a good deal more. Read the whole thing.