AI research published without code, data or documentation is interesting to readers but unhelpful to science: RSNA pubs review
Over a five-year period ending last December, only a third of 218 scientific articles on AI in four popular radiology journals shared the researchers’ code.
Worse, a paltry 2% of the full 218 shared both code and enough experimental data to facilitate reproducibility for subsequent studies.
The latter lack “may defeat the purpose” of sharing code at all, comment the researchers behind the findings.
Corresponding author Paul Yi, MD, of the University of Maryland, lead author Kesavan Venkatesh of Johns Hopkins and colleagues had their literature review published Aug. 17 in Radiology: Artificial Intelligence [1].
The authors state they focused on that title and three other RSNA journals for two reasons. One, the journals are widely read by radiologists. And two, editorial board members of all four have called for open code sharing.
Of the 218 articles Yi and co-reviewers included in their study, most ran in Radiology (48%) or Radiology: Artificial Intelligence (44%). The rest were published by Radiology: Cardiothoracic Imaging (6%) and Radiology: Imaging Cancer (3%).
Findings Worrisome, But Change May Be Afoot
Along with finding code sharing in 73 of the 218 articles (34%), the authors report that only 24 of these 73 (33%) had adequate documentation for code implementation.
Also:
- Just 29 of the 218 reviewed articles (13%) shared data, while 12 of the 29 data-sharing articles (41%) offered complete experimental data by using only public domain datasets.
- Four of the 218 articles—the previously noted 2%—displayed both code and complete experimental data.
- Code sharing rates were significantly higher in 2020 and 2021 than in earlier years, and they were greater in Radiology and Radiology: Artificial Intelligence than in the other two RSNA journals in the study.
While noting that AI code and data sharing has recently improved in the reviewed journals, Yi and co-authors cite previous findings across multiple journals consistent with their own.
“Altogether, these findings are worrisome, as sharing code is a key component to facilitating transparent and reproducible science in AI research,” they remark.
Upward Trends Emerging in Both Code Sharing and Code Documentation
At the same time, the authors scanned the landscape and flagged several developments that point to an overall course correction.
They spotlight the Ten Years Reproducibility Challenge launched in 2019, which actively encourages researchers to update and revalidate code from papers published 10 or more years prior.
Yi and team also give a shoutout to the online database Papers with Code and a “reproducibility/completeness checklist” spawned by the database project.
In addition, leading AI research conferences “have adopted such guidelines as requirements for official submission in addition to hosting annual paper reproducibility challenges,” the authors write.
Indeed, although the so-far low rates of code sharing and code documentation “may be discouraging,” Yi and co-authors state, “our study showed upward trends of both over time, demonstrating the field’s growing recognition of the importance of these practices.”
More:
"Nonetheless, there is room for improvement, which can be facilitated by journals and the peer review process. For example, reproducible code sharing can be improved by radiology journals through mandatory code and documentation availability upon article submission, reproducibility checks during the peer review process, and standardized publication of accompanying code repositories and model demos."
RSNA notes the Yi et al. study, presently posted in a “Just Accepted” section, has undergone peer review but is awaiting copyediting and proofing en route to publication in Radiology: Artificial Intelligence.