Multi-scale aggregation network for temporal action proposals

Bibliographic Details
Published in Pattern Recognition Letters, Vol. 122, pp. 60-65
Main Authors Wang, Zheng; Chen, Kai; Zhang, Mingxing; He, Peilin; Wang, Yajie; Zhu, Ping; Yang, Yang
Format Journal Article
Language English
Published Amsterdam: Elsevier B.V., 01.05.2019
Elsevier Science Ltd
Abstract
• We propose a network with temporal multi-layer dilated convolution for temporal action proposals (TAP).
• We propose unit-level and proposal-level multi-scale aggregation strategies.
• We propose soft labelling to facilitate action boundary unit prediction.
Temporal action detection is a challenging and valuable task for video analysis and its applications. Detection results depend to a great extent on the quality of temporal action proposals. However, the duration of actions in videos varies dramatically, e.g. from a fraction of a second to minutes, which makes accurate temporal action proposal generation difficult. In this paper, we propose a multi-scale aggregation network to cope with these variations. The proposed network generates an actionness score sequence for the input video to automatically perceive the duration of actions, and can thus dynamically generate action proposals of corresponding lengths. For more reliable actionness prediction, we propose to adaptively explore the intrinsic short- and long-range dependencies in actions through two multi-scale aggregation strategies: unit-level multi-scale aggregation and proposal-level multi-scale aggregation. We also propose soft labelling to facilitate actionness prediction for units near action boundaries. Extensive experiments on the THUMOS14 dataset demonstrate the effectiveness of the proposed method.
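Illustrative note: the abstract states that proposals of varying length are derived from a per-unit actionness score sequence, but this record does not say how the scores are grouped. The sketch below shows one simple way such grouping can work, by thresholding the scores and merging consecutive high-scoring units. It is an assumption for illustration only, not the authors' procedure; the function name, the `threshold` and `min_len` parameters, and the example scores are all hypothetical.

```python
from typing import List, Tuple


def proposals_from_actionness(
    scores: List[float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
    min_len: int = 1,        # hypothetical minimum proposal length, in units
) -> List[Tuple[int, int]]:
    """Group consecutive units with score >= threshold into proposals.

    Each proposal is a (start, end) pair of unit indices, end exclusive.
    """
    proposals: List[Tuple[int, int]] = []
    start = None  # index where the current high-score run began, if any
    for i, score in enumerate(scores):
        if score >= threshold and start is None:
            start = i  # a new run of action-like units begins
        elif score < threshold and start is not None:
            if i - start >= min_len:
                proposals.append((start, i))
            start = None  # the run has ended
    # Close a run that extends to the end of the sequence.
    if start is not None and len(scores) - start >= min_len:
        proposals.append((start, len(scores)))
    return proposals


if __name__ == "__main__":
    example_scores = [0.1, 0.2, 0.8, 0.9, 0.7, 0.3, 0.1, 0.6, 0.9, 0.2]
    print(proposals_from_actionness(example_scores))  # [(2, 5), (7, 9)]
```

Because proposal boundaries follow the score sequence, short and long actions yield short and long proposals without a fixed set of window sizes, which is the property the abstract emphasises.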
Author_xml – sequence: 1
  givenname: Zheng
  surname: Wang
  fullname: Wang, Zheng
  organization: University of Electronic Science and Technology of China, No.2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu, 611731, China
– sequence: 2
  givenname: Kai
  surname: Chen
  fullname: Chen, Kai
  email: chenkai@gzbdi.com
  organization: Guizhou Food Safety Inspection and Application Engineering Technology Research Center Co.,Ltd, Changling South Road, National High-tech Zone, Guiyang, 550002, China
– sequence: 3
  givenname: Mingxing
  surname: Zhang
  fullname: Zhang, Mingxing
  organization: University of Electronic Science and Technology of China, No.2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu, 611731, China
– sequence: 4
  givenname: Peilin
  surname: He
  fullname: He, Peilin
  email: peilinhe@qq.com
  organization: University of Electronic Science and Technology of China, No.2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu, 611731, China
– sequence: 5
  givenname: Yajie
  surname: Wang
  fullname: Wang, Yajie
  email: wangyajie@gzbdi.com
  organization: Guizhou Food Safety Inspection and Application Engineering Technology Research Center Co.,Ltd, Changling South Road, National High-tech Zone, Guiyang, 550002, China
– sequence: 6
  givenname: Ping
  surname: Zhu
  fullname: Zhu, Ping
  email: lange@gzata.cn
  organization: Guizhou Food Safety Inspection and Application Engineering Technology Research Center Co.,Ltd, Changling South Road, National High-tech Zone, Guiyang, 550002, China
– sequence: 7
  givenname: Yang
  orcidid: 0000-0002-5070-4511
  surname: Yang
  fullname: Yang, Yang
  email: dlyyang@gmail.com
  organization: University of Electronic Science and Technology of China, No.2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu, 611731, China
CitedBy_id crossref_primary_10_1016_j_ipm_2019_102130
crossref_primary_10_1109_TMM_2021_3053775
crossref_primary_10_1007_s11280_022_01058_7
crossref_primary_10_1007_s13042_023_01774_0
crossref_primary_10_1016_j_jvcir_2020_102934
Cites_doi 10.1109/TIP.2018.2855422
10.1109/TIP.2017.2676345
10.1109/TIP.2018.2855415
10.1109/TMM.2017.2729019
10.1109/TIP.2017.2655449
10.1109/TCYB.2018.2831447
10.1109/TNNLS.2018.2851077
10.3233/FI-2000-411207
10.1016/j.patcog.2017.08.029
10.1109/TIP.2016.2614136
10.1016/j.sigpro.2017.12.008
ContentType Journal Article
Copyright 2019
Copyright Elsevier Science Ltd. May 1, 2019
Copyright_xml – notice: 2019
– notice: Copyright Elsevier Science Ltd. May 1, 2019
DOI 10.1016/j.patrec.2019.02.007
DatabaseName CrossRef
Computer and Information Systems Abstracts
Neurosciences Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
DatabaseTitle CrossRef
Technology Research Database
Computer and Information Systems Abstracts – Academic
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
Neurosciences Abstracts
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts Professional
DatabaseTitleList Technology Research Database

DeliveryMethod fulltext_linktorsrc
Discipline Engineering
Computer Science
EISSN 1872-7344
EndPage 65
ExternalDocumentID 10_1016_j_patrec_2019_02_007
S0167865518307840
ISSN 0167-8655
IsPeerReviewed true
IsScholarly true
Language English
ORCID 0000-0002-5070-4511
PublicationCentury 2000
PublicationDate 2019-05-01
2019-05-00
20190501
PublicationDateYYYYMMDD 2019-05-01
PublicationDate_xml – month: 05
  year: 2019
  text: 2019-05-01
  day: 01
PublicationDecade 2010
PublicationPlace Amsterdam
PublicationPlace_xml – name: Amsterdam
PublicationTitle Pattern recognition letters
PublicationYear 2019
Publisher Elsevier B.V
Elsevier Science Ltd
Publisher_xml – name: Elsevier B.V
– name: Elsevier Science Ltd
References Shen, Xu, Liu, Yang, Huang, Shen (bib0018) 2018
Roerdink, Meijster (bib0017) 2000; 41
Wang, Schmid (bib0025) 2013
Caba Heilbron, Carlos Niebles, Ghanem (bib0002) 2016
Gao, Yang, Nevatia (bib0007) 2017
Ioffe, Szegedy (bib0010) 2015; Volume 37
Wang, Qiao, Tang (bib0026) 2014; 1
Yang, Zhou, Ai, Bin, Hanjalic, Shen (bib0032) 2018
Zhang, Yang, Ji, Xie, Shen (bib0038) 2018; 145
Dai, Singh, Zhang, Davis, Chen (bib0005) 2017
Shou, Chan, Zareian, Miyazawa, Chang (bib0019) 2017
Yu, Koltun, Funkhouser (bib0035) 2017; 2
Wang, Xiong, Wang, Qiao, Lin, Tang, Van Gool (bib0027) 2017
Lin, Zhao, Shou (bib0013) 2017
Gao, Yang, Sun, Chen, Nevatia (bib0008) 2017
Xu, Das, Saenko (bib0030) 2017
Bin, Yang, Shen, Xie, Shen, Li (bib0001) 2018
Chao, Vijayanarasimhan, Seybold, Ross, Deng, Sukthankar (bib0003) 2018
Oneata, Verbeek, Schmid (bib0015) 2014
Gao, Guo, Zhang, Xu, Shen (bib0009) 2017; 19
Soomro, Zamir, Shah (bib0023) 2012
Lin, Zhao, Su, Wang, Yang (bib0014) 2018
Simonyan, Zisserman (bib0021) 2014
Wu, Wang, Gao, Li (bib0029) 2018; 73
Xu, Shen, Yang, Shen, Li (bib0031) 2017; 26
Feichtenhofer, Pinz, Zisserman (bib0006) 2016
Richard, Gall (bib0016) 2016
Yu, Koltun (bib0034) 2015
Zhang, Yang, Zhang, Ji, Shen, Chua (bib0039) 2019; 28
F. Chollet, et al., Keras, 2015.
Shou, Wang, Chang (bib0020) 2016
Wang, Lin, Wu, Zhang (bib0028) 2017; 26
Zhao, Xiong, Wang, Wu, Tang, Lin (bib0040) 2017; 2
Yeung, Russakovsky, Mori, Fei-Fei (bib0033) 2016
Y. Jiang, J. Liu, A.R. Zamir, G. Toderici, I. Laptev, M. Shah, R. Sukthankar, Thumos challenge: Action recognition with a large number of classes, 2014.
Song, Guo, Gao, Li, Hanjalic, Shen (bib0022) 2018
Yuan, Stroud, Lu, Deng (bib0037) 2017; 2
Yu, Yang, Huang, Wang, Song, Shen (bib0036) 2016; 25
Kingma, Ba (bib0012) 2014
Tran, Bourdev, Fergus, Torresani, Paluri (bib0024) 2015
References_xml – start-page: 1130
  year: 2018
  end-page: 1139
  ident: bib0003
  article-title: Rethinking the faster r-cnn architecture for temporal action localization
  publication-title: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
  contributor:
    fullname: Sukthankar
– start-page: 1049
  year: 2016
  end-page: 1058
  ident: bib0020
  article-title: Temporal action localization in untrimmed videos via multi-stage cnns
  publication-title: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
  contributor:
    fullname: Chang
– volume: 41
  start-page: 187
  year: 2000
  end-page: 228
  ident: bib0017
  article-title: The watershed transform: definitions, algorithms and parallelization strategies
  publication-title: Fundam. Inform.
  contributor:
    fullname: Meijster
– start-page: 1417
  year: 2017
  end-page: 1426
  ident: bib0019
  article-title: Cdc: Convolutional-de-convolutional networks for precise temporal action localization in untrimmed videos
  publication-title: Computer Vision and Pattern Recognition (CVPR), 2017 IEEE Conference on
  contributor:
    fullname: Chang
– year: 2014
  ident: bib0012
  article-title: Adam: a method for stochastic optimization
  publication-title: arXiv preprint arXiv:1412.6980
  contributor:
    fullname: Ba
– start-page: 2678
  year: 2016
  end-page: 2687
  ident: bib0033
  article-title: End-to-end learning of action detection from frame glimpses in videos
  publication-title: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
  contributor:
    fullname: Fei-Fei
– start-page: 1914
  year: 2016
  end-page: 1923
  ident: bib0002
  article-title: Fast temporal activity proposals for efficient detection of human actions in untrimmed videos
  publication-title: Proceedings of the IEEE conference on computer vision and pattern recognition
  contributor:
    fullname: Ghanem
– start-page: 5727
  year: 2017
  end-page: 5736
  ident: bib0005
  article-title: Temporal context network for activity localization in videos
  publication-title: Computer Vision (ICCV), 2017 IEEE International Conference on
  contributor:
    fullname: Chen
– year: 2015
  ident: bib0034
  article-title: Multi-scale context aggregation by dilated convolutions
  publication-title: arXiv preprint arXiv:1511.07122
  contributor:
    fullname: Koltun
– volume: 145
  start-page: 137
  year: 2018
  end-page: 145
  ident: bib0038
  article-title: Recurrent attention network using spatial-temporal relations for action recognition
  publication-title: Signal Processing
  contributor:
    fullname: Shen
– volume: 26
  start-page: 1393
  year: 2017
  end-page: 1404
  ident: bib0028
  article-title: Effective multi-query expansions: collaborative deep networks for robust landmark retrieval
  publication-title: IEEE Trans. Image Process.
  contributor:
    fullname: Zhang
– year: 2018
  ident: bib0001
  article-title: Describing video with attention based bidirectional lstm
  publication-title: IEEE Trans. Cybern.
  contributor:
    fullname: Li
– volume: 19
  start-page: 2045
  year: 2017
  end-page: 2055
  ident: bib0009
  article-title: Video captioning with attention-based LSTM and semantic consistency
  publication-title: IEEE Trans. Multimedia
  contributor:
    fullname: Shen
– year: 2018
  ident: bib0014
  article-title: Bsn: boundary sensitive network for temporal action proposal generation
  publication-title: arXiv preprint arXiv:1806.02964
  contributor:
    fullname: Yang
– year: 2018
  ident: bib0018
  article-title: Unsupervised deep hashing with similarity-adaptive and discrete optimization
  contributor:
    fullname: Shen
– year: 2012
  ident: bib0023
  article-title: Ucf101: A dataset of 101 human actions classes from videos in the wild
  publication-title: arXiv preprint arXiv:1212.0402
  contributor:
    fullname: Shah
– volume: 26
  start-page: 2494
  year: 2017
  end-page: 2507
  ident: bib0031
  article-title: Learning discriminative binary codes for large-scale cross-modal retrieval
  publication-title: IEEE Trans. Image Process.
  contributor:
    fullname: Li
– start-page: 568
  year: 2014
  end-page: 576
  ident: bib0021
  article-title: Two-stream convolutional networks for action recognition in videos
  publication-title: Advances in neural information processing systems
  contributor:
    fullname: Zisserman
– volume: Volume 37
  start-page: 448
  year: 2015
  end-page: 456
  ident: bib0010
  article-title: Batch normalization: accelerating deep network training by reducing internal covariate shift
  publication-title: Proceedings of the International Conference on Machine Learning
  contributor:
    fullname: Szegedy
– volume: 2
  year: 2017
  ident: bib0040
  article-title: Temporal action detection with structured segment networks
  publication-title: ICCV, Oct
  contributor:
    fullname: Lin
– start-page: 988
  year: 2017
  end-page: 996
  ident: bib0013
  article-title: Single shot temporal action detection
  publication-title: Proceedings of the 2017 ACM on Multimedia Conference
  contributor:
    fullname: Shou
– year: 2018
  ident: bib0022
  article-title: From deterministic to generative: multimodal stochastic rnns for video captioning
  publication-title: IEEE Trans. Neural Netw. Learn. Syst.
  contributor:
    fullname: Shen
– start-page: 5794
  year: 2017
  end-page: 5803
  ident: bib0030
  article-title: R-c3d: region convolutional 3d network for temporal activity detection
  publication-title: IEEE Int. Conf. on Computer Vision (ICCV)
  contributor:
    fullname: Saenko
– year: 2018
  ident: bib0032
  article-title: Video captioning by adversarial lstm
  publication-title: IEEE Trans. Image Process.
  contributor:
    fullname: Shen
– start-page: 1933
  year: 2016
  end-page: 1941
  ident: bib0006
  article-title: Convolutional two-stream network fusion for video action recognition
  publication-title: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
  contributor:
    fullname: Zisserman
– start-page: 3131
  year: 2016
  end-page: 3140
  ident: bib0016
  article-title: Temporal action detection using a statistical language model
  publication-title: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
  contributor:
    fullname: Gall
– start-page: 3551
  year: 2013
  end-page: 3558
  ident: bib0025
  article-title: Action recognition with improved trajectories
  publication-title: Proceedings of the IEEE international conference on computer vision
  contributor:
    fullname: Schmid
– year: 2017
  ident: bib0027
  article-title: Temporal segment networks for action recognition in videos
  publication-title: arXiv preprint arXiv:1705.02953
  contributor:
    fullname: Van Gool
– start-page: 4489
  year: 2015
  end-page: 4497
  ident: bib0024
  article-title: Learning spatiotemporal features with 3d convolutional networks
  publication-title: Proceedings of the IEEE international conference on computer vision
  contributor:
    fullname: Paluri
– volume: 1
  start-page: 2
  year: 2014
  ident: bib0026
  article-title: Action recognition and detection by combining motion and appearance features
  publication-title: THUMOS14 Action Recognition Challenge
  contributor:
    fullname: Tang
– volume: 2
  start-page: 3
  year: 2017
  ident: bib0035
  article-title: Dilated residual networks.
  publication-title: CVPR
  contributor:
    fullname: Funkhouser
– volume: 25
  start-page: 5689
  year: 2016
  end-page: 5701
  ident: bib0036
  article-title: Web video event recognition by semantic analysis from ubiquitous documents
  publication-title: IEEE Trans. Image Process.
  contributor:
    fullname: Shen
– year: 2017
  ident: bib0007
  article-title: Cascaded boundary regression for temporal action detection
  publication-title: arXiv preprint arXiv:1705.01180
  contributor:
    fullname: Nevatia
– year: 2014
  ident: bib0015
  article-title: The lear submission at thumos 2014
  publication-title: ECCV THUMOS Workshop
  contributor:
    fullname: Schmid
– volume: 73
  start-page: 275
  year: 2018
  end-page: 288
  ident: bib0029
  article-title: Deep adaptive feature embedding with local sample distributions for person re-identification
  publication-title: Pattern Recognit.
  contributor:
    fullname: Li
– volume: 28
  start-page: 32
  year: 2019
  end-page: 44
  ident: bib0039
  article-title: More is better: precise and detailed image captioning using online positive recall and missing concepts mining
  publication-title: IEEE Trans. Image Process.
  contributor:
    fullname: Chua
– year: 2017
  ident: bib0008
  article-title: Turn tap: temporal unit regression network for temporal action proposals
  publication-title: arXiv preprint arXiv:1703.06189
  contributor:
    fullname: Nevatia
– volume: 2
  start-page: 7
  year: 2017
  ident: bib0037
  article-title: Temporal action localization by structured maximal sums.
  publication-title: CVPR
  contributor:
    fullname: Deng
SourceID proquest
crossref
elsevier
SourceType Aggregation Database
Publisher
StartPage 60
SubjectTerms Agglomeration
Labeling
Proposals
Title Multi-scale aggregation network for temporal action proposals
URI https://dx.doi.org/10.1016/j.patrec.2019.02.007
https://www.proquest.com/docview/2218308997
Volume 122