An encoder-decoder network for crowd counting based on multi-scale attention mechanism

Crowd counting is a challenging computer vision task, which is widely used in video surveillance and public safety applications. With the increase of camera resolution and the complexity of crowd image, it becomes an important problem to predict crowd density and crowd count accurately. Recent CNN-b...

Full description

Saved in:
Bibliographic Details
Published inMultimedia tools and applications Vol. 84; no. 3; pp. 1187 - 1210
Main Authors Chuang, Hao-Hsiang, Chen, Yi-Cheng, Lin, Chang Hong
Format Journal Article
LanguageEnglish
Published New York Springer US 01.01.2025
Springer Nature B.V
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Crowd counting is a challenging computer vision task, which is widely used in video surveillance and public safety applications. With the increase of camera resolution and the complexity of crowd image, it becomes an important problem to predict crowd density and crowd count accurately. Recent CNN-based density estimation methods have shown effectiveness in densely populated scenes. In this paper, we present a novel approach to crowd counting through the development of an Encoder-Decoder Multi-Scale Attention Network. Our approach leverages the robust U-net architecture as the backbone network, strengthened by the strategic integration of an attention mechanism. We adopt a multi-scale attention method to each different layers in the U-net backbone to make the network extract features which focus on the crowds, instead of the background in the images. The attention mechanism and the skip-connections can adjust the weights of feature maps while maintaining features at different scales. Extensive experiments on ShanghaiTech Part_A & B and UCF-QNRF dataset demonstrate that our network can achieve better performances with Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) values outperforming existing methodologies: ShanghaiTech Part_A (MAE/RMSE: 60.0/104.9), Part_B (MAE/RMSE: 7.8/13.8), and UCF-QNRF (MAE/RMSE: 98.6/179.7).
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:1573-7721
1380-7501
1573-7721
DOI:10.1007/s11042-024-19055-5