Deep Convolutional Correlation Filter Learning Toward Robust Visual Object Tracking

Bibliographic Details
Published in	Chinese Control and Decision Conference, pp. 4313-4320
Main Authors	Bouraffa, Tayssir; Feng, Zihang; Wang, Yuxuan; Yan, Liping; Xia, Yuanqing; Xiao, Bo
Format	Conference Proceeding
Language	English
Published	IEEE 15.08.2022
Subjects	convolutional neural network; Correlation; correlation filters; Feature extraction; hierarchical features; Information filters; Semantics; Target tracking; Video sequences; Visual tracking; Visualization
ISSN	1948-9447
DOI	10.1109/CCDC55256.2022.10034306

Abstract	Convolutional neural networks have recently been widely adopted in visual object tracking for their ability to discriminate the target from the surrounding background. Most visual object trackers extract deep features from a single layer, generally the last convolutional layer. However, these trackers become less effective when the target undergoes drastic appearance variations caused by challenging conditions such as occlusion, illumination change, and background clutter. This paper develops a novel tracking algorithm that introduces an elastic net constraint and contextual information into the convolutional network to track the target throughout a video sequence. Hierarchical features are extracted from both the shallow and the deep convolutional layers to improve tracking accuracy and robustness. The deep convolutional layers capture semantic information and are therefore more robust to target appearance variations, whereas the shallow convolutional layers encode spatial details that allow the target to be localized more precisely. Moreover, Peak-Strength Context-Aware correlation filters are embedded at the output of each convolutional layer, producing multi-level convolutional response maps that collaboratively estimate the target position in a coarse-to-fine manner. Quantitative and qualitative experiments on the widely used OTB-2015 benchmark show impressive results compared with state-of-the-art trackers.
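The coarse-to-fine localization mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the per-layer response maps have already been computed and resized to a common grid, and it replaces the paper's Peak-Strength Context-Aware filters with plain precomputed score maps. The function names `argmax_2d` and `coarse_to_fine_locate`, the `radius` parameter, and the example maps `DEEP` and `SHALLOW` are all hypothetical.

```python
# Hypothetical sketch: coarse-to-fine peak search over per-layer response maps.
# Assumption: maps are ordered from deepest to shallowest and share one grid size.

def argmax_2d(response, center=None, radius=None):
    """Return (row, col) of the largest value in a 2D list.

    If center is given, only cells within `radius` rows and columns
    of it are considered.
    """
    best_value, best_pos = float("-inf"), None
    for r, row in enumerate(response):
        for c, value in enumerate(row):
            if center is not None:
                if abs(r - center[0]) > radius or abs(c - center[1]) > radius:
                    continue  # outside the search window
            if value > best_value:
                best_value, best_pos = value, (r, c)
    return best_pos


def coarse_to_fine_locate(responses_deep_to_shallow, radius=1):
    """Locate the target by refining the deep-layer peak on shallower maps."""
    # Coarse estimate: global peak of the deepest (most semantic) map.
    position = argmax_2d(responses_deep_to_shallow[0])
    # Refinement: each shallower map may only move the estimate locally.
    for response in responses_deep_to_shallow[1:]:
        position = argmax_2d(response, center=position, radius=radius)
    return position


# Made-up example maps (not data from the paper).
DEEP = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.2, 0.3, 0.1],
    [0.1, 0.3, 0.9, 0.4],
    [0.1, 0.1, 0.3, 0.1],
]
SHALLOW = [
    [0.95, 0.1, 0.1, 0.1],  # strong distractor far from the deep-layer peak
    [0.1, 0.1, 0.2, 0.1],
    [0.1, 0.1, 0.5, 0.8],
    [0.1, 0.1, 0.1, 0.2],
]

print(coarse_to_fine_locate([DEEP, SHALLOW], radius=1))  # (2, 3)
```

In this toy example the shallow map's global maximum at (0, 0) is a distractor; constraining the search to the neighborhood of the deep-layer peak keeps the estimate on the target while still using the shallow map's finer spatial detail.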
Author	Bouraffa, Tayssir; Xia, Yuanqing; Feng, Zihang; Wang, Yuxuan; Yan, Liping; Xiao, Bo
Author_xml	– sequence: 1 givenname: Tayssir surname: Bouraffa fullname: Bouraffa, Tayssir organization: Beijing Institute of Technology,Key Laboratory of Intelligent Control and Decision of Complex Systems,School of Automation,Beijing,P. R. China,100081 – sequence: 2 givenname: Zihang surname: Feng fullname: Feng, Zihang organization: Beijing Institute of Technology,Key Laboratory of Intelligent Control and Decision of Complex Systems,School of Automation,Beijing,P. R. China,100081 – sequence: 3 givenname: Yuxuan surname: Wang fullname: Wang, Yuxuan organization: Beijing Institute of Technology,Key Laboratory of Intelligent Control and Decision of Complex Systems,School of Automation,Beijing,P. R. China,100081 – sequence: 4 givenname: Liping surname: Yan fullname: Yan, Liping email: ylp@bit.edu.cn organization: Beijing Institute of Technology,Key Laboratory of Intelligent Control and Decision of Complex Systems,School of Automation,Beijing,P. R. China,100081 – sequence: 5 givenname: Yuanqing surname: Xia fullname: Xia, Yuanqing organization: Beijing Institute of Technology,Key Laboratory of Intelligent Control and Decision of Complex Systems,School of Automation,Beijing,P. R. China,100081 – sequence: 6 givenname: Bo surname: Xiao fullname: Xiao, Bo organization: Beijing University of Posts and Telecommunications,School of Artificial Intelligence,Beijing,P. R. China,100876
ContentType	Conference Proceeding
DBID	6IE 6IL CBEJK RIE RIL
DOI	10.1109/CCDC55256.2022.10034306
DatabaseName	IEEE Electronic Library (IEL) Conference Proceedings IEEE Xplore POP ALL IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml	– sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher
DeliveryMethod	fulltext_linktorsrc
Discipline	Engineering
EISBN	1665478969 9781665478960
EISSN	1948-9447
EndPage	4320
ExternalDocumentID	10034306
Genre	orig-research
GrantInformation_xml	– fundername: National Natural Science Foundation of China funderid: 10.13039/501100001809
GroupedDBID	29B 6IE 6IF 6IK 6IL 6IN AAJGR AAWTH ABLEC ADZIZ ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK IPLJI M43 OCL RIE RIL
ID	FETCH-LOGICAL-i204t-8ec43addb98f9cce07660b539888c655cc965a3abbf3035c72387600bc50af913
IEDL.DBID	RIE
IngestDate	Wed Aug 27 02:58:31 EDT 2025
IsPeerReviewed	false
IsScholarly	false
Language	English
LinkModel	DirectLink
MergedId	FETCHMERGED-LOGICAL-i204t-8ec43addb98f9cce07660b539888c655cc965a3abbf3035c72387600bc50af913
PageCount	8
ParticipantIDs	ieee_primary_10034306
PublicationCentury	2000
PublicationDate	2022-Aug.-15
PublicationDateYYYYMMDD	2022-08-15
PublicationDate_xml	– month: 08 year: 2022 text: 2022-Aug.-15 day: 15
PublicationDecade	2020
PublicationTitle	Chinese Control and Decision Conference
PublicationTitleAbbrev	CCDC
PublicationYear	2022
Publisher	IEEE
Publisher_xml	– name: IEEE
SSID	ssj0066300
Score	1.8062396
SourceID	ieee
SourceType	Publisher
StartPage	4313
SubjectTerms	convolutional neural network; Correlation; correlation filters; Feature extraction; hierarchical features; Information filters; Semantics; Target tracking; Video sequences; Visual tracking; Visualization
Title	Deep Convolutional Correlation Filter Learning Toward Robust Visual Object Tracking
URI	https://ieeexplore.ieee.org/document/10034306
hasFullText	1
inHoldings	1
isFullTextHit
isPrint
linkProvider	IEEE