A Transformer-Based Framework for Tiny Object Detection
| Published in | 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 373 - 377 |
|---|---|
| Main Authors | , , |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 31.10.2023 |
Summary: This paper proposes a fully transformer-based method for building an end-to-end model dedicated to tiny object detection. The approach eliminates components that are difficult to design for tiny objects, such as anchor generation and non-maximum suppression. Additionally, it addresses the limited receptive fields that convolutional neural networks offer for tiny objects by using self-attention. The model, named Swin-Deformable DEtection TRansformer (SD DETR), integrates Swin Transformer [1] and Deformable DETR [2]. Architectural enhancements and an optimized loss function further improve the model's ability to detect tiny objects. Experimental results on the AI-TOD [3] dataset show that SD DETR achieves 10.9 AP on very tiny objects of only 2 to 4 pixels, an improvement of +1.2 AP over the current state-of-the-art model. The code is available at https://github.com/kai271828/SD-DERT
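The abstract's claim that anchor generation and non-maximum suppression are hard to design for tiny objects can be made concrete with a small, illustrative sketch (not taken from the paper itself): IoU, the quantity that both anchor matching and NMS threshold on, is extremely sensitive to sub-pixel-scale shifts when boxes are only a few pixels wide. The `iou` helper and the box sizes below are hypothetical examples chosen to mirror AI-TOD's "very tiny" 2-4 pixel regime.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# A 3x3 box shifted by a single pixel loses most of its overlap,
# so it would fail a typical 0.5 IoU matching/NMS threshold...
tiny_iou = iou((0, 0, 3, 3), (1, 1, 4, 4))            # ~0.29
# ...while a 100x100 box barely notices the same one-pixel shift.
large_iou = iou((0, 0, 100, 100), (1, 1, 101, 101))   # ~0.96
print(round(tiny_iou, 2), round(large_iou, 2))
```

This sensitivity is one reason an end-to-end, set-prediction design that avoids hand-tuned IoU thresholds is attractive for tiny object detection.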
| ISSN | 2640-0103 |
|---|---|
| DOI | 10.1109/APSIPAASC58517.2023.10317511 |