Data augmentation for medical imaging: A systematic literature review

Bibliographic Details
Published in	Computers in biology and medicine Vol. 152; p. 106391
Main Authors	Garcea, Fabio; Serra, Alessio; Lamberti, Fabrizio; Morra, Lia
Format	Journal Article
Language	English
Published	United States: Elsevier Ltd, 01.01.2023
Subjects	Artificial neural networks; Data augmentation; Datasets; Deep Learning; Diagnostic Imaging; Generative adversarial networks; Image segmentation; Information technology; Internal Medicine; Literature reviews; Machine learning; Medical imaging; MRI; Neural networks; Neural Networks, Computer; Other; Regularization; Systematic review; Training; Visual tasks
Summary:	Recent advances in Deep Learning have largely benefited from larger and more diverse training sets. However, collecting large datasets for medical imaging is still a challenge due to privacy concerns and labeling costs. Data augmentation makes it possible to greatly expand the amount and variety of data available for training without actually collecting new samples. Data augmentation techniques range from simple yet surprisingly effective transformations such as cropping, padding, and flipping, to complex generative models. Depending on the nature of the input and the visual task, different data augmentation strategies are likely to perform differently. For this reason, it is conceivable that medical imaging requires specific augmentation strategies that generate plausible data samples and enable effective regularization of deep neural networks. Data augmentation can also be used to augment specific classes that are underrepresented in the training set, e.g., to generate artificial lesions. The goal of this systematic literature review is to investigate which data augmentation strategies are used in the medical domain and how they affect the performance of clinical tasks such as classification, segmentation, and lesion detection. To this end, a comprehensive analysis of more than 300 articles published in recent years (2018–2022) was conducted. The results highlight the effectiveness of data augmentation across organs, modalities, tasks, and dataset sizes, and suggest potential avenues for future research.
• Data augmentation is beneficial across all organs, modalities and tasks.
• Highest increase in performance associated with data augmentation for heart, lung and breast.
• Generative models and transformations have complementary roles and strengths.
• Affine and pixel-level transformations achieve the best trade-off between performance and complexity.
• Learnable data augmentation techniques remain unexplored.
ISSN:	0010-4825, 1879-0534
DOI:	10.1016/j.compbiomed.2022.106391
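
The summary names cropping, padding, and flipping as simple transformations. As a rough illustration only, and not the method of any reviewed article, the sketch below applies them to a small 2D image stored as a list of rows. All function names, the `margin` parameter, and the toy image are assumptions made for this example; whether a horizontal flip yields a plausible sample depends on the anatomy and the task.

```python
import random


def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def pad(image, margin, fill=0):
    """Surround the image with `margin` pixels of `fill` on every side."""
    width = len(image[0]) + 2 * margin
    border = [[fill] * width for _ in range(margin)]
    middle = [[fill] * margin + list(row) + [fill] * margin for row in image]
    bottom = [[fill] * width for _ in range(margin)]
    return border + middle + bottom


def crop(image, top, left, height, width):
    """Cut out a height x width window whose corner is at (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def augment(image, rng, margin=1):
    """Random horizontal flip, then pad and crop back to the original size.

    Padding followed by a random crop shifts the content by up to
    `margin` pixels in each direction while keeping the image size fixed.
    """
    height, width = len(image), len(image[0])
    if rng.random() < 0.5:
        image = flip_horizontal(image)
    padded = pad(image, margin)
    top = rng.randint(0, 2 * margin)   # randint includes both endpoints
    left = rng.randint(0, 2 * margin)
    return crop(padded, top, left, height, width)


# Toy 3x3 "image" standing in for a 2D slice (hypothetical data).
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
augmented = augment(image, random.Random(0))
```

Each call to `augment` with a fresh random state produces a different shifted or mirrored copy of the same size as the input, which is how such transformations enlarge a training set without collecting new samples.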