Multi-scale aggregation network for temporal action proposals
• We propose a network with temporal multi-layer dilated convolution for TAP. • We propose the unit level and proposal level multi-scale aggregation strategies. • We propose to take the soft labelling to facilitate action boundary unit prediction. Temporal action detection is a very challenging and valua...
Published in | Pattern recognition letters Vol. 122; pp. 60 - 65 |
---|---|
Main Authors | Wang, Zheng; Chen, Kai; Zhang, Mingxing; He, Peilin; Wang, Yajie; Zhu, Ping; Yang, Yang |
Format | Journal Article |
Language | English |
Published | Amsterdam; Elsevier B.V; 01.05.2019; Elsevier Science Ltd |
Subjects | |
Abstract | • We propose a network with temporal multi-layer dilated convolution for temporal action proposals (TAP). • We propose unit-level and proposal-level multi-scale aggregation strategies. • We propose soft labelling to facilitate action boundary unit prediction.
Temporal action detection is a challenging and valuable task for video analysis and applications. The detection results rely to a great extent on the quality of temporal action proposals. However, the duration of actions in videos varies dramatically, e.g. from a fraction of a second to minutes, which makes accurate temporal action proposals difficult. In this paper, we propose a multi-scale aggregation network to overcome those variations. The proposed network generates an actionness score sequence for the input video to automatically perceive the duration of actions, and can thus dynamically generate action proposals of corresponding lengths. For more reliable actionness prediction, we propose to adaptively explore the intrinsic short and long dependencies in actions through two multi-scale aggregation strategies: unit-level multi-scale aggregation and proposal-level multi-scale aggregation. We also propose soft labelling to facilitate actionness prediction for the units near the action boundaries. Extensive experiments on the THUMOS14 dataset demonstrate the effectiveness of the proposed method. |
---|---|
Author | Zhang, Mingxing; Wang, Zheng; Chen, Kai; He, Peilin; Wang, Yajie; Zhu, Ping; Yang, Yang |
Author_xml | – sequence: 1 givenname: Zheng surname: Wang fullname: Wang, Zheng organization: University of Electronic Science and Technology of China, No.2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu, 611731, China – sequence: 2 givenname: Kai surname: Chen fullname: Chen, Kai email: chenkai@gzbdi.com organization: Guizhou Food Safety Inspection and Application Engineering Technology Research Center Co.,Ltd, Changling South Road, National High-tech Zone, Guiyang, 550002, China – sequence: 3 givenname: Mingxing surname: Zhang fullname: Zhang, Mingxing organization: University of Electronic Science and Technology of China, No.2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu, 611731, China – sequence: 4 givenname: Peilin surname: He fullname: He, Peilin email: peilinhe@qq.com organization: University of Electronic Science and Technology of China, No.2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu, 611731, China – sequence: 5 givenname: Yajie surname: Wang fullname: Wang, Yajie email: wangyajie@gzbdi.com organization: Guizhou Food Safety Inspection and Application Engineering Technology Research Center Co.,Ltd, Changling South Road, National High-tech Zone, Guiyang, 550002, China – sequence: 6 givenname: Ping surname: Zhu fullname: Zhu, Ping email: lange@gzata.cn organization: Guizhou Food Safety Inspection and Application Engineering Technology Research Center Co.,Ltd, Changling South Road, National High-tech Zone, Guiyang, 550002, China – sequence: 7 givenname: Yang orcidid: 0000-0002-5070-4511 surname: Yang fullname: Yang, Yang email: dlyyang@gmail.com organization: University of Electronic Science and Technology of China, No.2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu, 611731, China |
CitedBy_id | crossref_primary_10_1016_j_ipm_2019_102130 crossref_primary_10_1109_TMM_2021_3053775 crossref_primary_10_1007_s11280_022_01058_7 crossref_primary_10_1007_s13042_023_01774_0 crossref_primary_10_1016_j_jvcir_2020_102934 |
Cites_doi | 10.1109/TIP.2018.2855422 10.1109/TIP.2017.2676345 10.1109/TIP.2018.2855415 10.1109/TMM.2017.2729019 10.1109/TIP.2017.2655449 10.1109/TCYB.2018.2831447 10.1109/TNNLS.2018.2851077 10.3233/FI-2000-411207 10.1016/j.patcog.2017.08.029 10.1109/TIP.2016.2614136 10.1016/j.sigpro.2017.12.008 |
ContentType | Journal Article |
Copyright | 2019 Copyright Elsevier Science Ltd. May 1, 2019 |
Copyright_xml | – notice: 2019 – notice: Copyright Elsevier Science Ltd. May 1, 2019 |
DBID | AAYXX CITATION 7SC 7TK 8FD JQ2 L7M L~C L~D |
DOI | 10.1016/j.patrec.2019.02.007 |
DatabaseName | CrossRef Computer and Information Systems Abstracts Neurosciences Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional |
DatabaseTitle | CrossRef Technology Research Database Computer and Information Systems Abstracts – Academic ProQuest Computer Science Collection Computer and Information Systems Abstracts Neurosciences Abstracts Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Professional |
DatabaseTitleList | Technology Research Database |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Engineering Computer Science |
EISSN | 1872-7344 |
EndPage | 65 |
ExternalDocumentID | 10_1016_j_patrec_2019_02_007 S0167865518307840 |
IEDL.DBID | AIKHN |
ISSN | 0167-8655 |
IngestDate | Thu Oct 10 22:01:02 EDT 2024 Fri Dec 06 04:03:50 EST 2024 Fri Feb 23 02:24:34 EST 2024 |
IsPeerReviewed | true |
IsScholarly | true |
Language | English |
LinkModel | DirectLink |
ORCID | 0000-0002-5070-4511 |
PQID | 2218308997 |
PQPubID | 2047552 |
PageCount | 6 |
ParticipantIDs | proquest_journals_2218308997 crossref_primary_10_1016_j_patrec_2019_02_007 elsevier_sciencedirect_doi_10_1016_j_patrec_2019_02_007 |
PublicationCentury | 2000 |
PublicationDate | 2019-05-01 2019-05-00 20190501 |
PublicationDateYYYYMMDD | 2019-05-01 |
PublicationDate_xml | – month: 05 year: 2019 text: 2019-05-01 day: 01 |
PublicationDecade | 2010 |
PublicationPlace | Amsterdam |
PublicationPlace_xml | – name: Amsterdam |
PublicationTitle | Pattern recognition letters |
PublicationYear | 2019 |
Publisher | Elsevier B.V Elsevier Science Ltd |
Publisher_xml | – name: Elsevier B.V – name: Elsevier Science Ltd |
SSID | ssj0006398 |
Score | 2.3820972 |
SourceID | proquest crossref elsevier |
SourceType | Aggregation Database Publisher |
StartPage | 60 |
SubjectTerms | Agglomeration Labeling Proposals |
Title | Multi-scale aggregation network for temporal action proposals |
URI | https://dx.doi.org/10.1016/j.patrec.2019.02.007 https://www.proquest.com/docview/2218308997 |
Volume | 122 |
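
The abstract mentions two ideas that a small example can make concrete: soft labelling of units near action boundaries, and turning an actionness score sequence into variable-length proposals. The sketch below is illustrative only and is not the authors' method. The linear ramp, the `ramp` and `threshold` parameters, and both function names are assumptions introduced here; the record does not describe the paper's labelling function, its grouping rule, or its dilated-convolution network, which is not sketched at all.

```python
from typing import List, Tuple


def soft_labels(num_units: int, start: int, end: int, ramp: int = 2) -> List[float]:
    """Per-unit actionness targets for one action covering units [start, end).

    Units inside the action get 1.0. Units up to `ramp` units outside a
    boundary get a linearly decaying value instead of a hard 0 (an assumed
    ramp shape, chosen only for illustration).
    """
    labels = []
    for t in range(num_units):
        if start <= t < end:
            labels.append(1.0)
            continue
        # Distance in units from t to the nearest unit inside the action.
        dist = start - t if t < start else t - (end - 1)
        labels.append(max(0.0, 1.0 - dist / (ramp + 1)))
    return labels


def proposals_from_actionness(
    scores: List[float], threshold: float = 0.5
) -> List[Tuple[int, int]]:
    """Group consecutive units with score >= threshold into [start, end) proposals."""
    proposals = []
    start = None
    for t, score in enumerate(scores):
        if score >= threshold and start is None:
            start = t  # a new run of high-actionness units begins
        elif score < threshold and start is not None:
            proposals.append((start, t))  # the run ended at unit t - 1
            start = None
    if start is not None:
        # The last run extends to the end of the video.
        proposals.append((start, len(scores)))
    return proposals
```

Because proposals are read off the score sequence, their lengths follow the predicted action durations rather than a fixed set of window sizes, which is the property the abstract highlights.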