Robust Data Augmentation Generative Adversarial Network for Object Detection

Bibliographic Details
Published in: Sensors (Basel, Switzerland), Vol. 23, No. 1, p. 157
Main Authors: Lee, Hyungtak; Kang, Seongju; Chung, Kwangsue
Format: Journal Article
Language: English
Published: Switzerland: MDPI AG, 23.12.2022

Summary: Generative adversarial network (GAN)-based data augmentation is used to enhance the performance of object detection models. It comprises two stages: training the GAN generator to learn the distribution of a small target dataset, and sampling data from the trained generator to enhance model performance. In this paper, we propose a pipelined model, called robust data augmentation GAN (RDAGAN), that aims to augment small datasets used for object detection. First, clean images and a small dataset containing images from various domains are input into the RDAGAN, which then generates images similar to those in the input dataset. Thereafter, it divides the image generation task into two networks: an object generation network and an image translation network. The object generation network generates images of the objects located within the bounding boxes of the input dataset, and the image translation network merges these images with clean images. A quantitative experiment confirmed that the generated images improve the fire detection performance of a YOLOv5 model. A comparative evaluation showed that the RDAGAN can maintain the background information of input images and localize the object generation location. Moreover, ablation studies demonstrated that all components and objects included in the RDAGAN play pivotal roles.
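
The summary describes a two-network pipeline: an object generation network synthesizes object crops for the bounding-box regions of the small dataset, and an image translation network merges those crops with clean background images. The sketch below illustrates that data flow only; it is not the authors' implementation. The class names, layer configurations, latent size, crop size, and the paste-then-translate compositing step are all assumptions introduced for illustration.

# Hypothetical sketch of the two-network flow outlined in the summary.
# Architectures and the compositing step are assumptions, not the paper's code.
import torch
import torch.nn as nn

class ObjectGenerationNetwork(nn.Module):
    """Generates a 64x64 object crop (e.g., a flame patch) from random noise."""
    def __init__(self, z_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, 256, 4, 1, 0), nn.BatchNorm2d(256), nn.ReLU(True),
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(True),
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(True),
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.BatchNorm2d(32), nn.ReLU(True),
            nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh(),  # -> (B, 3, 64, 64)
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z.view(z.size(0), -1, 1, 1))

class ImageTranslationNetwork(nn.Module):
    """Refines a clean image with a pasted object so the composite looks coherent."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(True),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, composite: torch.Tensor) -> torch.Tensor:
        return self.net(composite)

def augment(clean: torch.Tensor, box: tuple, obj_gen: nn.Module,
            translator: nn.Module) -> torch.Tensor:
    """Paste a generated object into box = (x, y, w, h) of a clean image,
    then translate the composite into an augmented training sample."""
    x, y, w, h = box
    z = torch.randn(clean.size(0), 128)
    obj = torch.nn.functional.interpolate(obj_gen(z), size=(h, w))
    composite = clean.clone()
    composite[:, :, y:y + h, x:x + w] = obj  # object localized to the given box
    return translator(composite)

if __name__ == "__main__":
    clean = torch.rand(1, 3, 256, 256) * 2 - 1  # placeholder clean background image
    aug = augment(clean, (80, 96, 64, 64), ObjectGenerationNetwork(), ImageTranslationNetwork())
    print(aug.shape)  # torch.Size([1, 3, 256, 256])

In an augmentation setting such as the one evaluated in the paper, samples produced this way would be added, with their bounding boxes, to the training set of a detector such as YOLOv5.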
ISSN: 1424-8220
DOI: 10.3390/s23010157