CCST: crowd counting with swin transformer


Bibliographic Details
Published in: The Visual Computer, Vol. 39, No. 7, pp. 2671-2682
Main Authors: Li, Bo; Zhang, Yong; Xu, Haihui; Yin, Baocai
Format: Journal Article
Language: English
Published: Berlin/Heidelberg: Springer Berlin Heidelberg, 01.07.2023 (Springer Nature B.V.)

Summary: Accurately estimating the number of individuals in an image is the goal of crowd counting. The task has long faced two major difficulties: uneven distribution of crowd density and a large span of head sizes. For the former, most CNN-based methods divide the image into multiple patches for processing, ignoring the connections between patches. For the latter, multi-scale feature fusion methods based on feature pyramids ignore the matching relationship between head size and the hierarchical features. In response to these issues, we propose a crowd counting network named CCST based on the Swin Transformer, and tailor a feature-adaptive fusion regression head called FAFHead. The Swin Transformer can fully exchange information within and between patches, effectively alleviating the problem of unevenly distributed crowd density. FAFHead can adaptively fuse multi-level features, improving the match between head size and feature pyramid level and relieving the problem of the large span of head sizes. Experimental results on common datasets show that CCST outperforms all weakly supervised counting works and the great majority of popular density map-based fully supervised works.
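The abstract gives no implementation details, but the fusion idea can be sketched. Below is a minimal, hypothetical PyTorch sketch of an adaptive multi-level fusion regression head in the spirit of FAFHead: each pyramid level from a Swin-like backbone is projected to a common width, input-dependent softmax weights decide each level's contribution, and a scalar count is regressed (count-level supervision, as in weakly supervised counting). All names, channel sizes, and design choices here are assumptions for illustration, not the paper's actual architecture.

```python
# Hypothetical sketch of adaptive multi-level feature fusion for count
# regression. Module name (FAFHead), channel widths, and the gating scheme
# are illustrative assumptions, not the architecture from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FAFHead(nn.Module):
    """Aligns pyramid levels to one width, predicts per-level fusion
    weights from pooled features, and regresses a scalar crowd count."""

    def __init__(self, in_channels=(192, 384, 768), width=256):
        super().__init__()
        # 1x1 convs align every pyramid level to the same channel width.
        self.align = nn.ModuleList(nn.Conv2d(c, width, 1) for c in in_channels)
        # Predicts one fusion logit per level from globally pooled features.
        self.gate = nn.Linear(width * len(in_channels), len(in_channels))
        self.regress = nn.Sequential(
            nn.Linear(width, width), nn.ReLU(inplace=True), nn.Linear(width, 1)
        )

    def forward(self, feats):
        # Upsample all levels to the finest resolution, then align channels.
        size = feats[0].shape[-2:]
        aligned = [
            F.interpolate(proj(f), size=size, mode="bilinear", align_corners=False)
            for proj, f in zip(self.align, feats)
        ]
        # Input-dependent softmax weights decide how much each level matters,
        # one plausible way to match head size to pyramid hierarchy.
        pooled = torch.cat([a.mean(dim=(2, 3)) for a in aligned], dim=1)
        w = torch.softmax(self.gate(pooled), dim=1)  # (B, num_levels)
        fused = sum(w[:, i, None, None, None] * aligned[i]
                    for i in range(len(aligned)))
        return self.regress(fused.mean(dim=(2, 3)))  # (B, 1) count

# Usage with dummy pyramid features (shapes roughly Swin-T stages 2-4):
feats = [torch.randn(2, 192, 28, 28),
         torch.randn(2, 384, 14, 14),
         torch.randn(2, 768, 7, 7)]
print(FAFHead()(feats).shape)  # torch.Size([2, 1])
```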
ISSN: 0178-2789
EISSN: 1432-2315
DOI: 10.1007/s00371-022-02485-3