CSTrans: Correlation-guided Self-Activation Transformer for Counting Everything

Bibliographic Details
Published in: Pattern Recognition, Vol. 153, p. 110556
Main Authors: Gao, Bin-Bin; Huang, Zhongyi
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.09.2024

Abstract Counting everything, also called few-shot counting, requires a model to count objects of any novel (unseen) category given only a few exemplar boxes. However, existing few-shot counting methods are sub-optimal due to weak feature representation, such as the correlation between the exemplar patch and the query feature, and contextual dependencies in density map prediction. In this paper, we propose a very simple but effective method, CSTrans, consisting of a Correlation-guided Self-Activation (CSA) module and a Local Dependency Transformer (LDT) module, to mitigate these two issues, respectively. The CSA utilizes the correlation map to activate the semantic features and suppress the noisy influence of the query features, aiming to mine the potential relation while enriching the correlation representation. Furthermore, the LDT incorporates a Transformer to explore local contextual dependencies and predict the density map. Our method achieves competitive performance on the FSC-147 and CARPK datasets. We hope its simple implementation and superior performance can serve as a new and strong baseline for few-shot counting tasks and attract more interest in designing simple but effective models in future studies. Our code for CSTrans is available at https://github.com/gaobb/CSTrans.
• A simple but effective CSTrans framework for counting everything.
• A correlation-guided self-activation module for enriching feature representation.
• A local dependency transformer module for modeling local context dependency.
• Excellent performance on two few-shot counting datasets, FSC-147 and CARPK.
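The core CSA idea described in the abstract — scoring each query location by its correlation with the exemplar and using that map as a gate to activate matching features while suppressing noise — can be illustrated with a minimal NumPy sketch. This is a hypothetical simplification (the function name, pooled exemplar vector, and sigmoid gating are assumptions for illustration); the actual CSTrans implementation is in the linked repository.

```python
import numpy as np

def correlation_activation(query_feat, exemplar_feat):
    """Hypothetical sketch of correlation-guided activation.

    query_feat:    (C, H, W) query feature map
    exemplar_feat: (C,) pooled exemplar feature vector
    Returns an activated feature map of the same shape.
    """
    C, H, W = query_feat.shape
    # Correlation map: cosine similarity between the exemplar vector
    # and the query feature at every spatial location.
    q = query_feat.reshape(C, -1)                                # (C, H*W)
    q_norm = q / (np.linalg.norm(q, axis=0, keepdims=True) + 1e-8)
    e_norm = exemplar_feat / (np.linalg.norm(exemplar_feat) + 1e-8)
    corr = (e_norm @ q_norm).reshape(H, W)                       # (H, W), in [-1, 1]
    # Use the correlation map as a gate: exemplar-like locations
    # get higher weight, unrelated (noisy) responses are suppressed.
    gate = 1.0 / (1.0 + np.exp(-corr))                           # sigmoid
    return query_feat * gate[None, :, :]
```

In a density-map counting pipeline such as the one the abstract describes, the activated features would then feed a decoder (the LDT in CSTrans) that predicts a density map, and the final count is the sum over that map, e.g. `count = density_map.sum()`.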
ArticleNumber 110556
Authors:
– Gao, Bin-Bin (ORCID: 0000-0003-2572-8156; email: csgaobb@gmail.com)
– Huang, Zhongyi
CitedBy_doi 10.1007/s11760-024-03792-z
10.1016/j.imavis.2024.105383
ContentType Journal Article
Copyright 2024 Elsevier Ltd
DOI 10.1016/j.patcog.2024.110556
Discipline Computer Science
EISSN 1873-5142
ExternalDocumentID 10_1016_j_patcog_2024_110556
S0031320324003078
GrantInformation Funder: Tencent
ISSN 0031-3203
IsPeerReviewed true
IsScholarly true
Keywords Counting everything
Vision transformer
Local dependency
Few-shot counting
ORCID 0000-0003-2572-8156
PublicationCentury 2000
PublicationDate September 2024
PublicationDateYYYYMMDD 2024-09-01
PublicationDecade 2020
PublicationTitle Pattern Recognition
PublicationYear 2024
Publisher Elsevier Ltd
StartPage 110556
SubjectTerms Counting everything
Few-shot counting
Local dependency
Vision transformer
Title CSTrans: Correlation-guided Self-Activation Transformer for Counting Everything
URI https://dx.doi.org/10.1016/j.patcog.2024.110556
Volume 153