Deep Neural Networks Improve Radiologists' Performance in Breast Cancer Screening

Bibliographic Details
Published in: IEEE Transactions on Medical Imaging, Vol. 39, no. 4, pp. 1184–1194
Main Authors: Wu, Nan, Phang, Jason, Park, Jungkyu, Shen, Yiqiu, Huang, Zhe, Zorin, Masha, Jastrzebski, Stanislaw, Fevry, Thibault, Katsnelson, Joe, Kim, Eric, Wolfson, Stacey, Parikh, Ujas, Gaddam, Sushma, Lin, Leng Leng Young, Ho, Kara, Weinstein, Joshua D., Reig, Beatriu, Gao, Yiming, Toth, Hildegard, Pysarenko, Kristine, Lewin, Alana, Lee, Jiyon, Airola, Krystal, Mema, Eralda, Chung, Stephanie, Hwang, Esther, Samreen, Naziya, Kim, S. Gene, Heacock, Laura, Moy, Linda, Cho, Kyunghyun, Geras, Krzysztof J.
Format: Journal Article
Language: English
Published: United States, The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.04.2020

Summary: We present a deep convolutional neural network for breast cancer screening exam classification, trained and evaluated on over 200,000 exams (over 1,000,000 images). Our network achieves an AUC of 0.895 in predicting the presence of cancer in the breast when tested on the screening population. We attribute the high accuracy to a few technical advances: 1) our network's novel two-stage architecture and training procedure, which allow us to use a high-capacity patch-level network to learn from pixel-level labels alongside a network learning from macroscopic breast-level labels; 2) a custom ResNet-based network used as a building block of our model, whose balance of depth and width is optimized for high-resolution medical images; 3) pretraining the network on screening BI-RADS classification, a related task with noisier labels; and 4) combining multiple input views in an optimal way among a number of possible choices. To validate our model, we conducted a reader study with 14 readers, each reading 720 screening mammogram exams, and showed that our model is as accurate as experienced radiologists when presented with the same data. We also show that a hybrid model, averaging the probability of malignancy predicted by a radiologist with a prediction of our neural network, is more accurate than either of the two separately. To further understand our results, we conduct a thorough analysis of our network's performance on different subpopulations of the screening population, the model's design, training procedure, errors, and properties of its internal representations. Our best models are publicly available at https://github.com/nyukat/breast_cancer_classifier.
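The hybrid model described in the summary is a simple per-exam average of two malignancy probabilities, compared by AUC. The sketch below illustrates that idea on toy data; the `auc` helper and all numbers are illustrative assumptions, not the paper's evaluation code or results.

```python
import numpy as np

def auc(labels, scores):
    """Area under the ROC curve: the probability that a randomly chosen
    positive exam is scored higher than a randomly chosen negative one
    (ties count as half a win)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

# Toy malignancy probabilities for six exams (label 1 = cancer present).
labels      = np.array([1, 1, 1, 0, 0, 0])
radiologist = np.array([0.9, 0.4, 0.7, 0.6, 0.2, 0.3])
network     = np.array([0.6, 0.8, 0.3, 0.4, 0.5, 0.1])

# The hybrid averages the two probability estimates for each exam.
hybrid = 0.5 * (radiologist + network)

print(auc(labels, radiologist))  # 8/9  ≈ 0.889
print(auc(labels, network))      # 7/9  ≈ 0.778
print(auc(labels, hybrid))       # 8.5/9 ≈ 0.944
```

On this toy data the hybrid's AUC exceeds both individual AUCs, because the two scorers make errors on different exams and averaging partially cancels them, which is the intuition behind the reported human-machine combination.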
ISSN: 0278-0062
EISSN: 1558-254X
DOI: 10.1109/TMI.2019.2945514