VideoLSTM convolves, attends and flows for action recognition

Bibliographic Details
Published in Computer vision and image understanding, Vol. 166, pp. 41–50
Main Authors Li, Zhenyang; Gavrilyuk, Kirill; Gavves, Efstratios; Jain, Mihir; Snoek, Cees G.M.
Format Journal Article
Language English
Published Elsevier Inc., 01.01.2018
Subjects Action recognition; LSTM; Attention; Video representation
Online Access https://www.sciencedirect.com/science/article/pii/S1077314217301741 (full text)

Abstract
Highlights:
•To exploit both the spatial and temporal correlations in a video, we hardwire convolutions in the soft-Attention LSTM architecture.
•We introduce motion-based attention, which better guides the attention towards the relevant spatio-temporal locations of the actions.
•We demonstrate how the attention generated from our VideoLSTM can be used for action localization by relying on the action class label only.
•We show the theoretical as well as practical merits of our VideoLSTM against other LSTM architectures for action classification and localization.

We present VideoLSTM for end-to-end sequence learning of actions in video. Rather than adapting the video to the peculiarities of established recurrent or convolutional architectures, we adapt the architecture to fit the requirements of the video medium. Starting from the soft-Attention LSTM, VideoLSTM makes three novel contributions. First, video has a spatial layout. To exploit this spatial correlation we hardwire convolutions in the soft-Attention LSTM architecture. Second, motion not only informs us about the action content, but also better guides the attention towards the relevant spatio-temporal locations. We therefore introduce motion-based attention. Finally, we demonstrate how the attention from VideoLSTM can be exploited for action localization by relying on the action class label and temporal attention smoothing. Experiments on UCF101, HMDB51 and THUMOS13 reveal the benefit of the video-specific adaptations of VideoLSTM, both in isolation and when integrated in a combined architecture. It compares favorably against other LSTM architectures for action classification and especially action localization.
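The first two contributions described in the abstract amount to an LSTM whose gates operate on feature maps through convolutions and whose spatial attention is steered by motion features. The sketch below is a minimal PyTorch illustration of that idea, not the authors' reference implementation: the class and argument names, kernel sizes, the single attention map shared across channels, and the use of optical-flow features only for computing attention are illustrative assumptions rather than the paper's exact formulation.

```python
# Minimal sketch (not the paper's reference code) of a convolutional
# soft-attention LSTM cell with motion-guided attention. Assumptions:
# appearance features x_t and motion features m_t are CNN feature maps of
# identical shape; one spatial attention map is shared across channels.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvAttentionLSTMCell(nn.Module):
    def __init__(self, in_channels: int, hidden_channels: int, kernel_size: int = 3):
        super().__init__()
        pad = kernel_size // 2
        # Convolutional gates: input, forget, output, candidate (4 * hidden channels).
        self.gates = nn.Conv2d(in_channels + hidden_channels,
                               4 * hidden_channels, kernel_size, padding=pad)
        # Attention scores computed from motion features and the previous hidden state.
        self.att_x = nn.Conv2d(in_channels, hidden_channels, kernel_size, padding=pad)
        self.att_h = nn.Conv2d(hidden_channels, hidden_channels, kernel_size, padding=pad)
        self.att_out = nn.Conv2d(hidden_channels, 1, 1)
        self.hidden_channels = hidden_channels

    def forward(self, x_t, m_t, h_prev, c_prev):
        """x_t: appearance features (B, C, H, W); m_t: motion features of the
        same shape, used only to guide attention; h_prev, c_prev: previous states."""
        B, _, H, W = x_t.shape
        # Motion-guided spatial attention map, normalised over locations.
        scores = self.att_out(torch.tanh(self.att_x(m_t) + self.att_h(h_prev)))
        attn = F.softmax(scores.view(B, -1), dim=1).view(B, 1, H, W)
        x_att = attn * x_t  # re-weight appearance features spatially

        gates = self.gates(torch.cat([x_att, h_prev], dim=1))
        i, f, o, g = torch.split(gates, self.hidden_channels, dim=1)
        c_t = torch.sigmoid(f) * c_prev + torch.sigmoid(i) * torch.tanh(g)
        h_t = torch.sigmoid(o) * torch.tanh(c_t)
        return h_t, c_t, attn


# Hypothetical usage over a clip of T frames (shapes chosen for illustration).
if __name__ == "__main__":
    B, T, C, Hc, H, W = 2, 8, 512, 256, 7, 7
    cell = ConvAttentionLSTMCell(C, Hc)
    h = torch.zeros(B, Hc, H, W)
    c = torch.zeros(B, Hc, H, W)
    frames = torch.randn(B, T, C, H, W)   # e.g. last conv-layer features of RGB frames
    motion = torch.randn(B, T, C, H, W)   # e.g. the same features of optical flow
    for t in range(T):
        h, c, attn = cell(frames[:, t], motion[:, t], h, c)
```

The per-frame attention maps returned here are the kind of quantity the abstract refers to for localization: smoothed over time and combined with the predicted class label, they indicate where in each frame the action is taking place.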
Author Li, Zhenyang (zhenyangli@uva.nl); Gavrilyuk, Kirill; Gavves, Efstratios; Jain, Mihir; Snoek, Cees G.M.
Author affiliation QUVA Lab, University of Amsterdam, Science Park 904, Amsterdam, The Netherlands (all authors)
ContentType Journal Article
Copyright 2017 Elsevier Inc.
DOI 10.1016/j.cviu.2017.10.011
Discipline Applied Sciences
Engineering
Computer Science
EISSN 1090-235X
EndPage 50
ISSN 1077-3142
Keywords Action recognition
LSTM
Attention
Video representation
Language English
OpenAccessLink https://www.sciencedirect.com/science/article/pii/S1077314217301741
PageCount 10
PublicationDate January 2018
PublicationTitle Computer vision and image understanding
PublicationYear 2018
Publisher Elsevier Inc
StartPage 41
SubjectTerms Action recognition
Attention
LSTM
Video representation
Title VideoLSTM convolves, attends and flows for action recognition
URI https://dx.doi.org/10.1016/j.cviu.2017.10.011
Volume 166