ECSNet: Spatio-Temporal Feature Learning for Event Camera

Bibliographic Details
Published in IEEE Transactions on Circuits and Systems for Video Technology, Vol. 33, No. 2, pp. 701-712
Main Authors Chen, Zhiwen, Wu, Jinjian, Hou, Junhui, Li, Leida, Dong, Weisheng, Shi, Guangming
Format Journal Article
Language English
Published New York IEEE 01.02.2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects

Abstract Neuromorphic event cameras can efficiently sense the latent geometric structures and motion cues of a scene by generating asynchronous, sparse event signals. Because of the irregular layout of these signals, leveraging their plentiful spatio-temporal information for recognition tasks remains a significant challenge. Existing methods tend to treat events as dense image-like or point-series representations; however, they either severely destroy the sparsity of the event data or fail to encode robust spatial cues. To fully exploit the inherent sparsity while reconciling the spatio-temporal information, we introduce a compact event representation, the 2D-1T event cloud sequence (2D-1T ECS). We couple this representation with a novel lightweight spatio-temporal learning framework (ECSNet) that accommodates both object classification and action recognition tasks. The core of the framework is a hierarchical spatial relation module: equipped with a specially designed surface-event-based sampling unit and a local event normalization unit that enhance inter-event relation encoding, it learns robust geometric features from the 2D event clouds. We further propose a motion attention module that efficiently captures the long-term temporal context evolving with the 1T cloud sequence. Empirically, our framework achieves performance on par with or better than the state of the art. Importantly, it cooperates well with the sparsity of event data without any sophisticated operations, leading to low computational cost and high inference speed.
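The two core ideas in the abstract can be made concrete with small sketches. First, a minimal Python/NumPy sketch (not the authors' code) of packing a raw event stream into a 2D-1T event cloud sequence: events arrive as (x, y, t, p) tuples, the time axis is sliced into fixed bins (the "1T" sequence), and the events inside each bin are kept as a sparse 2D point cloud over the sensor plane (the "2D" clouds). The function name, bin count, and per-cloud point budget are illustrative assumptions, and plain uniform subsampling stands in for the paper's surface-event-based sampling unit.

import numpy as np

def events_to_ecs(events, num_bins=8, max_points=1024, rng=None):
    """Pack an (N, 4) array of (x, y, t, p) events into a 2D-1T cloud sequence.

    Returns num_bins arrays, each (<=max_points, 3) holding (x, y, p) rows:
    one sparse 2D event cloud per temporal slice.
    """
    rng = np.random.default_rng() if rng is None else rng
    t = events[:, 2]
    # Slice the time axis into equal bins: the "1T" sequence dimension.
    edges = np.linspace(t.min(), t.max() + 1e-9, num_bins + 1)
    bin_idx = np.clip(np.searchsorted(edges, t, side="right") - 1, 0, num_bins - 1)
    clouds = []
    for i in range(num_bins):
        cloud = events[bin_idx == i][:, [0, 1, 3]]  # keep (x, y, p) as a 2D cloud
        if len(cloud) > max_points:
            # Uniform subsampling here; the paper's surface-event-based sampling
            # unit is a more informed selection strategy (assumption: this stand-in).
            cloud = cloud[rng.choice(len(cloud), max_points, replace=False)]
        clouds.append(cloud)
    return clouds

Second, a generic single-head self-attention layer over per-slice feature vectors, as a hedged stand-in for how a motion attention module could aggregate long-term temporal context across the 1T sequence; it does not reproduce ECSNet's actual architecture.

import torch
import torch.nn as nn

class MotionAttention(nn.Module):
    """Self-attention across the temporal slices of a 2D-1T sequence.

    A generic stand-in for ECSNet's motion attention module, not the paper's design.
    """
    def __init__(self, dim):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)   # joint query/key/value projection
        self.proj = nn.Linear(dim, dim)      # output projection

    def forward(self, x):  # x: (batch, num_bins, dim), one feature per 2D cloud
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        attn = torch.softmax(q @ k.transpose(-2, -1) / x.shape[-1] ** 0.5, dim=-1)
        return x + self.proj(attn @ v)       # residual temporal context aggregation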
Author Shi, Guangming
Wu, Jinjian
Dong, Weisheng
Li, Leida
Hou, Junhui
Chen, Zhiwen
Author_xml – sequence: 1
  givenname: Zhiwen
  surname: Chen
  fullname: Chen, Zhiwen
  organization: School of Artificial Intelligence, Xidian University, Xi'an, China
– sequence: 2
  givenname: Jinjian
  orcidid: 0000-0001-7501-0009
  surname: Wu
  fullname: Wu, Jinjian
  email: jinjian.wu@mail.xidian.edu.cn
  organization: School of Artificial Intelligence, Xidian University, Xi'an, China
– sequence: 3
  givenname: Junhui
  orcidid: 0000-0003-3431-2021
  surname: Hou
  fullname: Hou, Junhui
  email: jh.hou@cityu.edu.hk
  organization: Department of Computer Science, City University of Hong Kong, Hong Kong, China
– sequence: 4
  givenname: Leida
  orcidid: 0000-0001-9069-8796
  surname: Li
  fullname: Li, Leida
  organization: School of Artificial Intelligence, Xidian University, Xi'an, China
– sequence: 5
  givenname: Weisheng
  orcidid: 0000-0002-9632-985X
  surname: Dong
  fullname: Dong, Weisheng
  organization: School of Artificial Intelligence, Xidian University, Xi'an, China
– sequence: 6
  givenname: Guangming
  orcidid: 0000-0003-2179-3292
  surname: Shi
  fullname: Shi, Guangming
  organization: School of Artificial Intelligence, Xidian University, Xi'an, China
CODEN ITCTEM
CitedBy_id crossref_primary_10_1109_TCSVT_2024_3482436
crossref_primary_10_1016_j_jmsy_2024_09_013
crossref_primary_10_1016_j_neunet_2024_106493
crossref_primary_10_1109_JAS_2024_124470
crossref_primary_10_1109_TCSVT_2023_3326294
crossref_primary_10_1109_TCSVT_2023_3272375
crossref_primary_10_1109_TCSVT_2023_3317976
crossref_primary_10_1109_TCSVT_2024_3495769
crossref_primary_10_1109_TCSVT_2023_3249195
crossref_primary_10_1007_s10489_024_05982_1
crossref_primary_10_1109_TIFS_2024_3409167
crossref_primary_10_1016_j_neucom_2025_129776
crossref_primary_10_1109_TCSVT_2024_3448615
crossref_primary_10_1109_TCSVT_2023_3301176
crossref_primary_10_1109_JSEN_2024_3524301
crossref_primary_10_1016_j_eswa_2024_126255
Cites_doi 10.1109/CVPR.2019.00398
10.1109/CVPR.2018.00186
10.1109/ICASSP.2019.8683606
10.1109/WACV51458.2022.00073
10.1109/CVPR.2018.00685
10.1109/TMM.2020.2965434
10.1109/TPAMI.2020.3008413
10.1109/TCSVT.2018.2841516
10.1109/CVPR.2019.00108
10.1109/ICCV.2019.00573
10.1109/TC.2021.3119180
10.1109/TPAMI.2019.2919301
10.1109/CVPR.2019.00344
10.1109/TIP.2020.3023597
10.1007/978-3-030-58565-5_9
10.1109/CVPR.2018.00568
10.1109/CVPR.2016.90
10.1109/JSSC.2010.2085952
10.1109/jssc.2007.914337
10.1109/LRA.2020.3002480
10.3389/fnins.2015.00437
10.1109/tcsvt.2021.3073673
10.3389/fnins.2017.00309
10.1109/CVPR42600.2020.01112
10.1109/CVPR42600.2020.00580
10.3389/fnins.2015.00481
10.1109/CVPR.2019.00401
10.1109/ICCV.2019.00058
10.1109/ICACI49185.2020.9177628
10.1109/ICECS.2018.8617982
10.1109/CVPR.2017.502
10.1109/tpami.2022.3161735
10.1109/TPAMI.2015.2392947
10.1109/WACV.2019.00199
10.1109/CVPR.2017.781
10.1109/ICCV.2015.510
10.1109/CVPR.2018.00675
10.1007/s11263-014-0788-3
10.1109/ISCAS45731.2020.9181247
10.1109/TPAMI.2016.2574707
10.1109/tcsvt.2022.3156653
10.1109/TIP.2021.3077136
10.1109/TCSVT.2020.3044287
10.1109/CVPR46437.2021.01398
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023
DBID 97E
RIA
RIE
AAYXX
CITATION
7SC
7SP
8FD
JQ2
L7M
L~C
L~D
DOI 10.1109/TCSVT.2022.3202659
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005–Present
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Xplore
CrossRef
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
DatabaseTitle CrossRef
Technology Research Database
Computer and Information Systems Abstracts – Academic
Electronics & Communications Abstracts
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts Professional
DatabaseTitleList Technology Research Database

Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISSN 1558-2205
EndPage 712
ExternalDocumentID 10_1109_TCSVT_2022_3202659
9869656
Genre orig-research
GrantInformation_xml – fundername: National Natural Science Foundation of China
  grantid: 62022063
  funderid: 10.13039/501100001809
GroupedDBID -~X
0R~
29I
4.4
5GY
5VS
6IK
97E
AAJGR
AARMG
AASAJ
AAWTH
ABAZT
ABQJQ
ABVLG
ACGFO
ACGFS
ACIWK
AENEX
AETIX
AGQYO
AGSQL
AHBIQ
AI.
AIBXA
AKJIK
AKQYR
ALLEH
ALMA_UNASSIGNED_HOLDINGS
ASUFR
ATWAV
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
CS3
DU5
EBS
EJD
HZ~
H~9
ICLAB
IFIPE
IFJZH
IPLJI
JAVBF
LAI
M43
O9-
OCL
P2P
RIA
RIE
RNS
RXW
TAE
TN5
VH1
AAYXX
CITATION
RIG
7SC
7SP
8FD
JQ2
L7M
L~C
L~D
IEDL.DBID RIE
ISSN 1051-8215
IngestDate Mon Jun 30 03:45:05 EDT 2025
Tue Jul 01 00:41:18 EDT 2025
Thu Apr 24 22:57:33 EDT 2025
Wed Aug 27 02:48:19 EDT 2025
IsPeerReviewed true
IsScholarly true
Issue 2
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
LinkModel DirectLink
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ORCID 0000-0001-9069-8796
0000-0003-2179-3292
0000-0001-7501-0009
0000-0003-3431-2021
0000-0002-9632-985X
PQID 2773447991
PQPubID 85433
PageCount 12
ParticipantIDs proquest_journals_2773447991
crossref_primary_10_1109_TCSVT_2022_3202659
crossref_citationtrail_10_1109_TCSVT_2022_3202659
ieee_primary_9869656
ProviderPackageCode CITATION
AAYXX
PublicationCentury 2000
PublicationDate 2023-02-01
PublicationDateYYYYMMDD 2023-02-01
PublicationDate_xml – month: 02
  year: 2023
  text: 2023-02-01
  day: 01
PublicationDecade 2020
PublicationPlace New York
PublicationPlace_xml – name: New York
PublicationTitle IEEE transactions on circuits and systems for video technology
PublicationTitleAbbrev TCSVT
PublicationYear 2023
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
References ref13
ref15
ref14
Vaswani (ref40); 30
ref11
ref10
ref17
ref16
ref19
ref18
ref51
ref50
ref46
ref45
ref48
ref47
ref42
Qi (ref31); 30
ref44
ref43
ref49
ref8
ref7
ref9
ref4
ref3
ref6
ref5
Lei Ba (ref34) 2016
ref35
ref37
ref30
ref32
Neil (ref12)
ref2
ref1
ref39
ref38
Dosovitskiy (ref41) 2020
Ioffe (ref33)
ref24
ref23
ref26
ref25
ref20
ref22
ref21
ref28
ref27
ref29
Bertasius (ref36); 2
References_xml – ident: ref19
  doi: 10.1109/CVPR.2019.00398
– ident: ref26
  doi: 10.1109/CVPR.2018.00186
– volume: 2
  start-page: 4
  volume-title: Proc. ICML
  ident: ref36
  article-title: Is space-time attention all you need for video understanding?
– ident: ref18
  doi: 10.1109/ICASSP.2019.8683606
– ident: ref38
  doi: 10.1109/WACV51458.2022.00073
– ident: ref48
  doi: 10.1109/CVPR.2018.00685
– ident: ref35
  doi: 10.1109/TMM.2020.2965434
– ident: ref6
  doi: 10.1109/TPAMI.2020.3008413
– ident: ref3
  doi: 10.1109/TCSVT.2018.2841516
– ident: ref17
  doi: 10.1109/CVPR.2019.00108
– ident: ref8
  doi: 10.1109/ICCV.2019.00573
– start-page: 1
  volume-title: Proc. NIPS
  ident: ref12
  article-title: Phased LSTM: Accelerating recurrent network training for long or event-based sequences
– ident: ref24
  doi: 10.1109/TC.2021.3119180
– ident: ref27
  doi: 10.1109/TPAMI.2019.2919301
– ident: ref29
  doi: 10.1109/CVPR.2019.00344
– ident: ref16
  doi: 10.1109/TIP.2020.3023597
– ident: ref9
  doi: 10.1007/978-3-030-58565-5_9
– ident: ref7
  doi: 10.1109/CVPR.2018.00568
– start-page: 448
  volume-title: Proc. Int. Conf. Mach. Learn.
  ident: ref33
  article-title: Batch normalization: Accelerating deep network training by reducing internal covariate shift
– ident: ref51
  doi: 10.1109/CVPR.2016.90
– ident: ref2
  doi: 10.1109/JSSC.2010.2085952
– volume: 30
  start-page: 1
  volume-title: Proc. Adv. Neural Inf. Process. Syst.
  ident: ref31
  article-title: PointNet++: Deep hierarchical feature learning on point sets in a metric space
– ident: ref1
  doi: 10.1109/jssc.2007.914337
– ident: ref20
  doi: 10.1109/LRA.2020.3002480
– ident: ref42
  doi: 10.3389/fnins.2015.00437
– ident: ref21
  doi: 10.1109/tcsvt.2021.3073673
– year: 2020
  ident: ref41
  article-title: An image is worth 16×16 words: Transformers for image recognition at scale
  publication-title: arXiv:2010.11929
– ident: ref44
  doi: 10.3389/fnins.2017.00309
– ident: ref32
  doi: 10.1109/CVPR42600.2020.01112
– ident: ref28
  doi: 10.1109/CVPR42600.2020.00580
– ident: ref43
  doi: 10.3389/fnins.2015.00481
– volume: 30
  start-page: 1
  volume-title: Proc. Adv. Neural Inf. Process. Syst.
  ident: ref40
  article-title: Attention is all you need
– ident: ref13
  doi: 10.1109/CVPR.2019.00401
– ident: ref30
  doi: 10.1109/ICCV.2019.00058
– ident: ref23
  doi: 10.1109/ICACI49185.2020.9177628
– ident: ref22
  doi: 10.1109/ICECS.2018.8617982
– ident: ref46
  doi: 10.1109/CVPR.2017.502
– ident: ref39
  doi: 10.1109/tpami.2022.3161735
– ident: ref11
  doi: 10.1109/TPAMI.2015.2392947
– ident: ref14
  doi: 10.1109/WACV.2019.00199
– ident: ref45
  doi: 10.1109/CVPR.2017.781
– ident: ref47
  doi: 10.1109/ICCV.2015.510
– ident: ref49
  doi: 10.1109/CVPR.2018.00675
– ident: ref10
  doi: 10.1007/s11263-014-0788-3
– ident: ref15
  doi: 10.1109/ISCAS45731.2020.9181247
– ident: ref25
  doi: 10.1109/TPAMI.2016.2574707
– ident: ref5
  doi: 10.1109/tcsvt.2022.3156653
– ident: ref50
  doi: 10.1109/TIP.2021.3077136
– ident: ref4
  doi: 10.1109/TCSVT.2020.3044287
– year: 2016
  ident: ref34
  article-title: Layer normalization
  publication-title: arXiv:1607.06450
– ident: ref37
  doi: 10.1109/CVPR46437.2021.01398
SSID ssj0014847
Score 2.5642264
SourceID proquest
crossref
ieee
SourceType Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 701
SubjectTerms action recognition
Brightness
Cameras
Cloud computing
Data mining
Event camera
Feature extraction
Machine learning
Modules
Moving object recognition
object classification
Representation learning
Representations
Robustness
Sparsity
spatio-temporal feature learning
Task analysis
Weight reduction
Title ECSNet: Spatio-Temporal Feature Learning for Event Camera
URI https://ieeexplore.ieee.org/document/9869656
https://www.proquest.com/docview/2773447991
Volume 33
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider IEEE