MVF-Net: A Multi-View Fusion Network for Event-Based Object Classification

Bibliographic Details
Published in IEEE Transactions on Circuits and Systems for Video Technology, Vol. 32, No. 12, pp. 8275-8284
Main Authors Deng, Yongjian; Chen, Hao; Li, Youfu
Format Journal Article
Language English
Published New York: IEEE, 01.12.2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Abstract Event-based object recognition has drawn increasing attention owing to event cameras' distinct advantages of low power consumption and high dynamic range. For this new modality, previous works built on hand-crafted low-level descriptors are vulnerable to noise and generalize poorly. Although recent works design various deep neural networks to extract event features, they either suffer from insufficient data to fully train an event-based model or fail to encode spatial and temporal cues simultaneously with a single-view network. In this work, we address these limitations by proposing a multi-view attention-aware network, in which an event stream is projected onto multi-view 2D maps so that well-trained 2D models can be exploited and spatio-temporal complements explored. In addition, an attention mechanism is used to amplify the complementary cues across the different streams for better joint inference. Comprehensive experiments show the clear superiority of our model over state-of-the-art methods as well as the efficacy of our multi-view fusion framework for event data.
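The abstract describes the pipeline only at a high level, so a minimal, hypothetical PyTorch sketch is given below to make the multi-view fusion idea concrete: each 2D map projected from the event stream is encoded by its own 2D CNN, a lightweight attention module scores the per-view features, and the attention-weighted sum feeds a shared classification head. This is not the authors' released code; the class names, layer sizes, view count, and the single-linear-layer attention form are assumptions made purely for illustration.

# Minimal, hypothetical sketch (NOT the authors' implementation) of multi-view
# fusion with attention for event-based classification. All names and sizes below
# are illustrative assumptions.
import torch
import torch.nn as nn

class ViewEncoder(nn.Module):
    """Small 2D CNN applied to one projected event view (e.g., a count or timestamp map)."""
    def __init__(self, in_ch=1, feat_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )

    def forward(self, x):              # x: (B, in_ch, H, W)
        return self.net(x).flatten(1)  # (B, feat_dim)

class MultiViewFusionClassifier(nn.Module):
    """Encodes each view separately, then fuses them with learned attention weights."""
    def __init__(self, num_views=3, feat_dim=128, num_classes=10):
        super().__init__()
        self.encoders = nn.ModuleList([ViewEncoder(1, feat_dim) for _ in range(num_views)])
        self.attn = nn.Linear(feat_dim, 1)   # scores each view's feature vector
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, views):          # views: list of (B, 1, H, W) tensors, one per view
        feats = torch.stack([enc(v) for enc, v in zip(self.encoders, views)], dim=1)  # (B, V, D)
        weights = torch.softmax(self.attn(feats), dim=1)    # (B, V, 1), attention over views
        fused = (weights * feats).sum(dim=1)                 # attention-weighted fusion
        return self.head(fused)                              # (B, num_classes)

# Example with three stand-in 128x128 views of one event stream (random tensors here).
model = MultiViewFusionClassifier(num_views=3, num_classes=101)
views = [torch.randn(4, 1, 128, 128) for _ in range(3)]
logits = model(views)                  # -> shape (4, 101)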
Author_xml – sequence: 1
  givenname: Yongjian
  orcidid: 0000-0001-6253-3564
  surname: Deng
  fullname: Deng, Yongjian
  email: yongjdeng2-c@my.cityu.edu.hk
  organization: Department of Mechanical Engineering, City University of Hong Kong, Hong Kong, SAR
– sequence: 2
  givenname: Hao
  orcidid: 0000-0002-3138-505X
  surname: Chen
  fullname: Chen, Hao
  email: haochen593@gmail.com
  organization: School of Computer Science and Engineering, Southeast University, Nanjing, China
– sequence: 3
  givenname: Youfu
  orcidid: 0000-0002-5227-1326
  surname: Li
  fullname: Li, Youfu
  email: meyfli@cityu.edu.hk
  organization: Department of Mechanical Engineering, City University of Hong Kong, Hong Kong, SAR
CODEN ITCTEM
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
DOI 10.1109/TCSVT.2021.3073673
Discipline Engineering
EISSN 1558-2205
EndPage 8284
Genre orig-research
GrantInformation_xml – fundername: Research Grants Council of Hong Kong
  grantid: CityU 11213420
  funderid: 10.13039/501100002920
– fundername: National Natural Science Foundation of China
  grantid: 61873220
  funderid: 10.13039/501100001809
– fundername: Science and Technology Development Fund, Macau
  grantid: 0022/2019/AKP
  funderid: 10.13039/501100003009
ISSN 1051-8215
IsPeerReviewed true
IsScholarly true
Issue 12
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
PageCount 10
PublicationCentury 2000
PublicationDate 2022-12-01
PublicationDecade 2020
PublicationPlace New York
PublicationTitle IEEE Transactions on Circuits and Systems for Video Technology
PublicationTitleAbbrev TCSVT
PublicationYear 2022
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 8275
SubjectTerms Artificial neural networks
attention
Cameras
Data models
Event data
Feature extraction
multi-view
object categorization
Object recognition
Power consumption
Power demand
Streaming media
Task analysis
Three-dimensional displays
Two dimensional models
Title MVF-Net: A Multi-View Fusion Network for Event-Based Object Classification
URI https://ieeexplore.ieee.org/document/9406060
https://www.proquest.com/docview/2747612058
Volume 32