Deep learning algorithm performance in mammography screening: A systematic review and meta-analysis

Bibliographic Details
Published in Journal of clinical oncology Vol. 39; no. 15_suppl; p. e13553
Main Authors Roela, Rosimeire Aparecida, Valente, Gabriel Vansuita, Shimizu, Carlos, Lopez, Rossana Veronica Mendoza, Tucunduva, Tatiana Cardoso de Mello, Folgueira, Guilherme Koike, Katayama, Maria Lucia Hirata, Petrini, Daniel Gustavo Pellacani, Novaes, Guilherme Apolinario Silva, Serio, Pedro Adolpho de Menezes Pacheco, Marta, Guilherme Nader, Sameshima, Koichi, Kim, Hae Yong, Koike Folgueira, Maria A. A.
Format Journal Article
Language English
Published 20.05.2021
Summary: Abstract only e13553

Background: Mammography interpretation presents some challenges; however, better technological approaches have increased accuracy in cancer diagnosis, and radiologists' sensitivity and specificity for mammography screening now range from 84.5% to 90.6% and from 89.7% to 92.0%, respectively. Since its introduction in breast image analysis, artificial intelligence (AI) has rapidly improved, and deep learning methods are gaining relevance as a companion tool for radiologists. The aim of this systematic review and meta-analysis was therefore to evaluate the sensitivity and specificity of AI deep learning algorithms and radiologists for breast cancer detection through mammography.

Methods: A systematic review was performed in PubMed using the terms "deep learning" or "convolutional neural network" and "mammography" or "mammogram", covering January 2015 to October 2020. All titles and abstracts were double-checked; duplicate studies and studies in languages other than English were excluded. The remaining full-text studies were assessed in duplicate, and data were collected from those reporting specificity and sensitivity. For the meta-analysis, studies reporting specificity, sensitivity, and confidence intervals were selected. Heterogeneity was measured using the Cochran Q test (chi-square test) and I² (percentage of variation). Sensitivity, specificity, and 95% confidence intervals (CI) were calculated using Stata/MP 14.0 for Windows.

Results: Among 223 studies, 66 were selected for full paper analysis and 24 were selected for data extraction. Subsequently, only papers evaluating sensitivity, specificity, CI and/or AUC were analyzed. Eleven studies compared AUC using AI with another method; for these studies a differential AUC was calculated, but no differences were observed: AI vs reader (n = 3; p = 0.109); AI vs AI (n = 5; p = 0.225); AI vs AI + reader (n = 2; p = 0.180); AI + reader vs reader (n = 2; p = 0.655); AI vs reader (n > 1) (n = 3; p = 0.102). Some studies had more than one comparison. A meta-analysis was performed to evaluate the sensitivity and specificity of the methods. Five studies were included in this analysis, and considerable heterogeneity among them was observed. There were studies evaluating more than one AI algorithm and studies comparing AI with readers alone or in combination with AI. Sensitivity for AI, AI + reader, and reader alone was 76.08, 84.02, and 80.91, respectively. Specificity for AI, AI + reader, and reader alone was 96.62, 85.67, and 84.89, respectively. Results are shown in the table.

Conclusions: Despite recent improvements in AI algorithms for breast cancer screening, no delta AUC was observed between comparisons of AI algorithms and readers. [Table: see text]
ISSN:0732-183X
1527-7755
DOI:10.1200/JCO.2021.39.15_suppl.e13553
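
The Methods name two heterogeneity measures, the Cochran Q test and I², without showing how they are computed. The block below is a minimal Python sketch of the standard fixed-effect (inverse-variance) definitions, not the authors' Stata procedure. The function name `cochran_q_i2` and the input values are hypothetical and chosen only for illustration; they are not data from the abstract.

```python
def cochran_q_i2(effects, variances):
    """Return (Q, degrees of freedom, I2 in percent).

    effects   -- per-study effect estimates (e.g. logit sensitivity)
    variances -- per-study within-study variances of those estimates
    Uses inverse-variance weights, the usual fixed-effect definition.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    weights = [1.0 / v for v in variances]
    # Inverse-variance weighted pooled estimate.
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I2: share of total variation attributed to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2


# Hypothetical inputs for illustration only (not taken from the abstract).
q, df, i2 = cochran_q_i2([0.0, 2.0], [0.5, 0.5])
print(f"Q = {q:.2f} on {df} df, I2 = {i2:.1f}%")  # Q = 4.00 on 1 df, I2 = 75.0%
```

Under the null hypothesis of homogeneity, Q is compared against a chi-square distribution with `df` degrees of freedom, which is why the abstract describes it as a chi-square test. A large I² corresponds to the kind of heterogeneity the Results report among the five pooled studies.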