Monocular human pose estimation: A survey of deep learning-based methods

Bibliographic Details
Published in: Computer Vision and Image Understanding, Vol. 192, Article 102897
Main Authors: Chen, Yucheng; Tian, Yingli; He, Mingyi
Format: Journal Article
Language: English
Published: Elsevier Inc., 01.03.2020

Abstract Vision-based monocular human pose estimation, one of the most fundamental and challenging problems in computer vision, aims to obtain the posture of the human body from input images or video sequences. Recent developments in deep learning techniques have brought significant progress and remarkable breakthroughs to the field of human pose estimation. This survey extensively reviews deep learning-based 2D and 3D human pose estimation methods published since 2014. It summarizes the challenges, main frameworks, benchmark datasets, evaluation metrics, and performance comparisons, and discusses promising future research directions.
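For context on the "evaluation metrics" the abstract mentions: the standard metric for 3D human pose estimation is MPJPE (mean per-joint position error), the average Euclidean distance between predicted and ground-truth joint positions. The sketch below is a minimal pure-Python illustration, not code from the surveyed paper; the joint tuples and millimetre units are assumed for the example.

```python
import math

def mpjpe(pred, gt):
    """Mean Per-Joint Position Error: average Euclidean distance
    between predicted and ground-truth 3D joint coordinates."""
    assert len(pred) == len(gt), "pose skeletons must have the same joint count"
    total = 0.0
    for (px, py, pz), (gx, gy, gz) in zip(pred, gt):
        total += math.sqrt((px - gx) ** 2 + (py - gy) ** 2 + (pz - gz) ** 2)
    return total / len(pred)

# Toy example: two joints, each displaced by 30 mm along one axis.
pred = [(0.0, 0.0, 30.0), (100.0, 30.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0)]
print(mpjpe(pred, gt))  # 30.0
```

Benchmarks such as Human3.6M commonly report MPJPE in millimetres, sometimes after a rigid alignment of the predicted pose to the ground truth.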
Article Number: 102897
Authors:
– Chen, Yucheng (chenyucheng@mail.nwpu.edu.cn), Northwestern Polytechnical University, Xi’an, 710072, China
– Tian, Yingli (ytian@ccny.cuny.edu), The City College, City University of New York, NY 10031, USA
– He, Mingyi (myhe@nwpu.edu.cn), Northwestern Polytechnical University, Xi’an, 710072, China
Copyright: 2020 Elsevier Inc.
DOI: 10.1016/j.cviu.2019.102897
Discipline: Applied Sciences; Engineering; Computer Science
EISSN: 1090-235X
ISSN: 1077-3142
Peer Reviewed: yes
Open Access: yes
Keywords: Deep learning; Survey; Human pose estimation
Open Access Link: https://doi.org/10.1016/j.cviu.2019.102897
10.1016/j.cviu.2019.102897_b9
10.1016/j.cviu.2019.102897_b20
Ren (10.1016/j.cviu.2019.102897_b138) 2015
10.1016/j.cviu.2019.102897_b6
10.1016/j.cviu.2019.102897_b4
10.1016/j.cviu.2019.102897_b3
References_xml
– reference: Charles, J., Pfister, T., Magee, D., Hogg, D., Zisserman, A., 2016. Personalizing human video pose estimation. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 3063–3072.
– volume: 39
  start-page: 501
  year: 2017
  end-page: 514
  ident: b33
  article-title: Marconi—convnet-based marker-less motion capture in outdoor and indoor scenes
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
– reference: Rogez, G., Weinzaepfel, P., Schmid, C., 2017. Lcr-net: Localization-classification-regression for human pose. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 3433–3441.
– reference: Chou, C.J., Chien, J.T., Chen, H.T., 2018. Self adversarial training for human pose estimation. In: Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, pp. 17-30.
– volume: 41
  start-page: 190
  year: 2017
  end-page: 204
  ident: b70
  article-title: Panoptic studio: A massively multiview system for social interaction capture
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
– reference: Andriluka, M., Pishchulin, L., Gehler, P., Schiele, B., 2014. 2d human pose estimation: New benchmark and state of the art analysis. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 3686–3693.
– volume: 35
  start-page: 2821
  year: 2012
  end-page: 2840
  ident: b147
  article-title: Efficient human pose estimation from single depth images
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
– year: 2016
  ident: b159
  article-title: Structured prediction of 3d human pose with deep neural networks
– start-page: 2277
  year: 2017
  end-page: 2287
  ident: b114
  article-title: Associative embedding: End-to-end learning for joint detection and grouping
  publication-title: Advances in Neural Information Processing Systems
– start-page: 437
  year: 2018
  end-page: 453
  ident: b76
  article-title: Multiposenet: Fast multi-person pose estimation using pose residual network
  publication-title: Proc. European Conference on Computer Vision
– reference: Papandreou, G., Zhu, T., Chen, L.C., Gidaris, S., Tompson, J., Murphy, K., 2018. Personlab: Person pose estimation and instance segmentation with a bottom-up, part-based, geometric embedding model. In: Proc. European Conference on Computer Vision, pp. 269-286.
– start-page: 468
  year: 2017
  end-page: 475
  ident: b7
  article-title: Recurrent human pose estimation
  publication-title: Proc. IEEE Conference on Automatic Face and Gesture Recognition
– reference: Cao, Z., Simon, T., Wei, S.E., Sheikh, Y., 2017. Realtime multi-person 2d pose estimation using part affinity fields. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291-7299.
– volume: 83
  start-page: 328
  year: 2018
  end-page: 339
  ident: b83
  article-title: Monocular depth estimation with hierarchical fusion of dilated cnns and soft-weighted-sum inference
  publication-title: Pattern Recognit.
– volume: 73
  start-page: 82
  year: 1999
  end-page: 98
  ident: b42
  article-title: The visual analysis of human movement: A survey
  publication-title: Comput. Vis. Image Underst.
– year: 2019
  ident: b169
  article-title: Vicon
– reference: Sun, X., Shang, J., Liang, S., Wei, Y., 2017. Compositional human pose regression. In: Proc. IEEE International Conference on Computer Vision, pp. 2602-2611.
– reference: Tome, D., Russell, C., Agapito, L., 2017. Lifting from the deep: Convolutional 3d pose estimation from a single image. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 2500-2509.
– reference: Zhao, M., Li, T., Abu Alsheikh, M., Tian, Y., Zhao, H., Torralba, A., Katabi, D., 2018. Through-wall human pose estimation using radio signals. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 7356–7365.
– reference: Li, Z., Dekel, T., Cole, F., Tucker, R., Snavely, N., Liu, C., Freeman, W., 2019. Learning the depths of moving people by watching frozen people. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 4521-4530.
– reference: Lin, T.Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S., 2017. Feature pyramid networks for object detection. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 2117–2125.
– start-page: 1799
  year: 2014
  end-page: 1807
  ident: b164
  article-title: Joint training of a convolutional network and a graphical model for human pose estimation
  publication-title: Advances in Neural Information Processing Systems
– volume: 16
  start-page: 1966
  year: 2016
  ident: b47
  article-title: Human pose estimation from monocular images: A comprehensive survey
  publication-title: Sensors
– year: 2017
  ident: b53
  article-title: Mobilenets: Efficient convolutional neural networks for mobile vision applications
– reference: Kreiss, S., Bertoni, L., Alahi, A., 2019. Pifpaf: Composite fields for human pose estimation. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 11977–11986.
– reference: Varol, G., Romero, J., Martin, X., Mahmood, N., Black, M.J., Laptev, I., Schmid, C., 2017. Learning from synthetic humans. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 4627–4635.
– reference: Li, C., Lee, G.H., 2019. Generating multiple hypotheses for 3d human pose estimation with mixture density network. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 9887–9895.
– volume: 36
  start-page: 1325
  year: 2014
  end-page: 1339
  ident: b59
  article-title: Human3.6m: Large scale datasets and predictive methods for 3d human sensing in natural environments
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
– reference: Sidenbladh, H., De la Torre, F., Black, M.J., 2000. A framework for modeling the appearance of 3d articulated figures. In: Proc. IEEE Conference on Automatic Face and Gesture Recognition, pp. 368–375.
– start-page: 302
  year: 2014
  end-page: 315
  ident: b64
  article-title: Modeep: A deep learning framework using motion features for human pose estimation
  publication-title: Proc. Asian Conference on Computer Vision
– start-page: 561
  year: 2016
  end-page: 578
  ident: b8
  article-title: Keep it smpl: Automatic estimation of 3d human pose and shape from a single image
  publication-title: Proc. European Conference on Computer Vision
– reference: Zanfir, A., Marinoiu, E., Sminchisescu, C., 2018. Monocular 3d pose and shape estimation of multiple people in natural scenes-the importance of multiple scene constraints. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 2148–2157.
– reference: Popa, A.I., Zanfir, M., Sminchisescu, C., 2017. Deep multitask architecture for integrated 2d and 3d human sensing. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 4714–4723.
– start-page: 91
  year: 2015
  end-page: 99
  ident: b138
  article-title: Faster r-cnn: Towards real-time object detection with region proposal networks
  publication-title: Advances in Neural Information Processing Systems
– start-page: 228
  year: 2010
  end-page: 242
  ident: b29
  article-title: We are family: Joint pose estimation of multiple persons
  publication-title: Proc. European Conference on Computer Vision
– reference: Pavlakos, G., Zhu, L., Zhou, X., Daniilidis, K., 2018b. Learning to estimate 3D human pose and shape from a single color image. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 459-468.
– reference: Yang, W., Li, S., Ouyang, W., Li, H., Wang, X., 2017. Learning feature pyramids for human pose estimation. In: Proc. IEEE International Conference on Computer Vision, pp. 1281–1290.
– year: 2018
  ident: b171
  article-title: Drpose3d: Depth ranking in 3d human pose estimation
– reference: Eichner, M., Ferrari, V., 2009. Better appearance models for pictorial structures. In: Proc. British Machine Vision Conference, p. 5.
– year: 2019
  ident: b105
  article-title: Xnect: Real-time multi-person 3d human pose estimation with a single rgb camera
– reference: Debnath, B., O’Brien, M., Yamaguchi, M., Behera, A., 2018. Adapting mobilenets for mobile based upper body pose estimation. In: Proc. IEEE Conference on Advanced Video and Signal Based Surveillance, pp. 1–6.
– reference: Ke, L., Chang, M.C., Qi, H., Lyu, S., 2018. Multi-scale structure-aware network for human pose estimation. In: Proc. European Conference on Computer Vision, pp. 713-728.
– volume: 171
  start-page: 118
  year: 2018
  end-page: 139
  ident: b172
  article-title: Rgb-d-based human motion recognition with deep learning: A survey
  publication-title: Comput. Vis. Image Underst.
– reference: Yang, W., Ouyang, W., Li, H., Wang, X., 2016. End-to-end learning of deformable mixture of parts and deep convolutional neural networks for human pose estimation. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 3073–3082.
– reference: Gkioxari, G., Hariharan, B., Girshick, R., Malik, J., 2014b. Using k-poselets for detecting people and localizing their keypoints. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 3582–3589.
– reference: Shahroudy, A., Liu, J., Ng, T.T., Wang, G., 2016. Ntu rgb+ d: A large scale dataset for 3d human activity analysis. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1010–1019.
– reference: Insafutdinov, E., Andriluka, M., Pishchulin, L., Tang, S., Levinkov, E., Andres, B., Schiele, B., 2017. Arttrack: Articulated multi-person tracking in the wild. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 6457–6465.
– volume: 36
  start-page: 44
  year: 2017
  ident: b107
  article-title: Vnect: Real-time 3d human pose estimation with a single rgb camera
  publication-title: ACM Trans. Graph.
– start-page: 2980
  year: 2017
  end-page: 2988
  ident: b51
  article-title: Mask r-cnn
  publication-title: Proc. IEEE International Conference on Computer Vision
– reference: Zhang, F., Zhu, X., Ye, M., 2019. Fast human pose estimation. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8.
– reference: Zhang, W., Zhu, M., Derpanis, K.G., 2013. From actemes to action: A strongly-supervised representation for detailed action understanding. In: Proc. IEEE International Conference on Computer Vision, pp. 2248–2255.
– reference: Luvizon, D.C., Picard, D., Tabia, H., 2018. 2d/3d pose estimation and action recognition using multitask deep learning. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 5137–5146.
– reference: Johnson, S., Everingham, M., 2011. Learning effective human pose estimation from inaccurate annotation. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1465–1472.
– reference: Nie, B.X., Wei, P., Zhu, S.C., 2017. Monocular 3d human pose estimation by predicting depth on joints. In: Proc. IEEE International Conference on Computer Vision, pp. 3447–3455.
– reference: Tan, J., Budvytis, I., Cipolla, R., 2017. Indirect deep structured learning for 3d human body shape and pose prediction. In: Proc. British Machine Vision Conference.
– reference: Güler, R.A., Neverova, N., Kokkinos, I., 2018. Densepose: Dense human pose estimation in the wild. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 7297–7306.
– reference: Arnab, A., Doersch, C., Zisserman, A., 2019. Exploiting temporal context for 3d human pose estimation in the wild. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 3395–3404.
– volume: 87
  start-page: 4
  year: 2010
  ident: b149
  article-title: Humaneva: Synchronized video and motion capture dataset and baseline algorithm for evaluation of articulated human motion
  publication-title: Int. J. Comput. Vis.
– reference: Xiao, B., Wu, H., Wei, Y., 2018. Simple baselines for human pose estimation and tracking. In: Proc. European Conference on Computer Vision, pp. 466–481.
– reference: Fang, H., Xie, S., Tai, Y.W., Lu, C., 2017. Rmpe: Regional multi-person pose estimation. In: Proc. IEEE International Conference on Computer Vision, pp. 2334–2343.
– reference: Tang, W., Yu, P., Wu, Y., 2018a. Deeply learned compositional models for human pose estimation. In: Proc. European Conference on Computer Vision, pp. 190–206.
– year: 2017
  ident: b175
  article-title: Ai challenger: A large-scale dataset for going deeper in image understanding
– start-page: 34
  year: 2016
  end-page: 50
  ident: b58
  article-title: Deepercut: A deeper, stronger, and faster multi-person pose estimation model
  publication-title: Proc. European Conference on Computer Vision
– volume: 152
  start-page: 1
  year: 2016
  end-page: 20
  ident: b145
  article-title: 3d human pose estimation: A review of the literature and analysis of covariates
  publication-title: Comput. Vis. Image Underst.
– reference: Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B., 2016. Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937.
– reference: Kanazawa, A., Black, M.J., Jacobs, D.W., Malik, J., 2018. End-to-end recovery of human shape and pose. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 7122–7131.
– reference: Chu, X., Ouyang, W., Li, H., Wang, X., 2016. Structured feature learning for pose estimation. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 4715–4723.
– year: 2018
  ident: b116
  article-title: Numerical coordinate regression with convolutional neural networks
– reference: Wang, Y., Tran, D., Liao, Z., 2011. Learning hierarchical poselets for human parsing. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1705–1712.
– start-page: 907
  year: 2014
  end-page: 913
  ident: b36
  article-title: A monocular pose estimation system based on infrared leds
  publication-title: Proc. IEEE International Conference on Robotics and Automation
– reference: Li, B., Shen, C., Dai, Y., Hengel, A., He, M., 2015a. Depth and surface normal estimation from monocular images using regression on deep features and hierarchical crfs. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1119–1127.
– volume: 40
  start-page: 13
  year: 2010
  end-page: 24
  ident: b67
  article-title: Advances in view-invariant human motion analysis: A review
  publication-title: IEEE Trans. Syst. Man Cybern. Part C
– reference: Moreno-Noguer, F., 2017. 3d human pose estimation from a single image via distance matrix regression. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1561–1570.
– reference: Tompson, J., Goroshin, R., Jain, A., LeCun, Y., Bregler, C., 2015. Efficient object localization using convolutional networks. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 648–656.
– volume: 34
  start-page: 120
  year: 2015
  ident: b132
  article-title: Dyna: A model of dynamic human shape in motion
  publication-title: ACM Trans. Graph.
– start-page: 2017
  year: 2015
  end-page: 2025
  ident: b62
  article-title: Spatial transformer networks
  publication-title: Advances in Neural Information Processing Systems
– year: 2019
  ident: b100
  article-title: Amass: Archive of motion capture as surface shapes
– reference: Ju, S.X., Black, M.J., Yacoob, Y., 1996. Cardboard people: A parameterized model of articulated image motion. In: Proc. IEEE Conference on Automatic Face and Gesture Recognition, pp. 38–44.
– reference: Trumble, M., Gilbert, A., Malleson, C., Hilton, A., Collomosse, J., 2017. Total capture: 3d human pose estimation fusing video and inertial sensors. In: Proc. British Machine Vision Conference, pp. 1–13.
– start-page: 186
  year: 2016
  end-page: 201
  ident: b185
  article-title: Deep kinematic pose regression
  publication-title: Proc. European Conference on Computer Vision
– start-page: 717
  year: 2016
  end-page: 732
  ident: b12
  article-title: Human pose estimation via convolutional part heatmap regression
  publication-title: Proc. European Conference on Computer Vision
– reference: Johnson, S., Everingham, M., 2010. Clustered pose and nonlinear appearance models for human pose estimation. In: Proc. British Machine Vision Conference, p. 5.
– reference: Li, B., Chen, H., Chen, Y., Dai, Y., He, M., 2017a. Skeleton boxes: Solving skeleton based action detection with a single deep convolutional neural network. In: Proc. IEEE International Conference on Multimedia and Expo Workshops, pp. 613–616.
– reference: Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z., 2016. Rethinking the inception architecture for computer vision. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826.
– year: 2013
  ident: b63
  article-title: Learning human pose estimation features with convolutional networks
– volume: 40
  start-page: 33
  year: 1975
  end-page: 51
  ident: b48
  article-title: Generalized procrustes analysis
  publication-title: Psychometrika
– start-page: 185
  year: 2008
  end-page: 211
  ident: b150
  article-title: 3d human motion analysis in monocular video: techniques and challenges
  publication-title: Human Motion
– reference: Qammaz, A., Argyros, A., 2019. Mocapnet: Ensemble of snn encoders for 3d human pose estimation in rgb images. In: Proc. British Machine Vision Conference.
– start-page: 332
  year: 2014
  end-page: 347
  ident: b80
  article-title: 3d human pose estimation from monocular images with deep convolutional neural network
  publication-title: Proc. Asian Conference on Computer Vision
– start-page: 740
  year: 2014
  end-page: 755
  ident: b93
  article-title: Microsoft coco: Common objects in context
  publication-title: Proc. European Conference on Computer Vision
– reference: Tang, W., Wu, Y., 2019. Does learning specific features for related parts help human pose estimation?. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1107–1116.
– reference: Iqbal, U., Milan, A., Gall, J., 2017. Posetrack: Joint multi-person pose estimation and tracking. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 2011-2020.
– reference: Zuffi, S., Freifeld, O., Black, M.J., 2012. From pictorial structures to deformable structures. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 3546–3553.
– reference: Pfister, T., Charles, J., Zisserman, A., 2015. Flowing convnets for human pose estimation in videos. In: Proc. IEEE International Conference on Computer Vision, pp. 1913–1921.
– volume: 61
  start-page: 38
  year: 1995
  end-page: 59
  ident: b26
  article-title: Active shape models-their training and application
  publication-title: Comput. Vis. Image Underst.
– reference: Rhodin, H., Salzmann, M., Fua, P., 2018a. Unsupervised geometry-aware representation for 3d human pose estimation. In: Proc. European Conference on Computer Vision, pp. 750-767.
– reference: Long, J., Shelhamer, E., Darrell, T., 2015. Fully convolutional networks for semantic segmentation. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440.
– reference: Zuffi, S., Black, M.J., 2015. The stitched puppet: A graphical model of 3d human shape and pose. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 3537–3546.
– reference: Joo, H., Simon, T., Sheikh, Y., 2018. Total capture: A 3d deformation model for tracking faces, hands, and bodies. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 8320–8329.
– year: 2011
  ident: b111
  article-title: Visual Analysis of Humans
– reference: Rafi, U., Leibe, B., Gall, J., Kostrikov, I., 2016. An efficient convolutional network for human pose estimation. In: Proc. British Machine Vision Conference, p. 2.
– volume: 61
  start-page: 55
  year: 2005
  end-page: 79
  ident: b39
  article-title: Pictorial structures for object recognition
  publication-title: Int. J. Comput. Vis.
– reference: Li, B., Dai, Y., Cheng, X., Chen, H., Lin, Y., He, M., 2017b. Skeleton based action recognition using translation-scale invariant image mapping and multi-scale deep cnn. In: Proc. IEEE International Conference on Multimedia and Expo Workshops, pp. 601–604.
– reference: Chen, C.H., Ramanan, D., 2017. 3d human pose estimation = 2d pose estimation + matching. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 7035–7043.
– volume: 38
  start-page: 1533
  year: 2016
  end-page: 1547
  ident: b102
  article-title: Human pose estimation from video and imus
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
– reference: Nie, X., Feng, J., Xing, J., Yan, S., 2018. Pose partition networks for multi-person pose estimation. In: Proc. European Conference on Computer Vision, pp. 684–699.
– reference: Pavlakos, G., Zhou, X., Daniilidis, K., 2018a. Ordinal depth supervision for 3d human pose estimation. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 7307-7316.
– reference: von Marcard, T., Henschel, R., Black, M.J., Rosenhahn, B., Pons-Moll, G., 2018. Recovering accurate 3d human pose in the wild using imus and a moving camera. In: Proc. European Conference on Computer Vision, pp. 601–617.
– reference: Moon, G., Chang, J.Y., Lee, K.M., 2019. Posefix: Model-agnostic general human pose refinement network. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 7773–7781.
– reference: Fan, X., Zheng, K., Lin, Y., Wang, S., 2015. Combining local appearance and holistic view: Dual-source deep neural networks for human pose estimation. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1347-1355.
– start-page: 1365
  year: 2009
  end-page: 1372
  ident: b11
  article-title: Poselets: Body part detectors trained using 3d human pose annotations
  publication-title: Proc. IEEE International Conference on Computer Vision
– reference: Lassner, C., Romero, J., Kiefel, M., Bogo, F., Black, M.J., Gehler, P.V., 2017. Unite the people: Closing the loop between 3d and 2d human representations. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 4704–4713.
– volume: 34
  start-page: 2282
  year: 2012
  end-page: 2288
  ident: b31
  article-title: Human pose co-estimation and applications
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
– year: 2012
  ident: b30
  article-title: Calvin upper-body detector v1.04
– reference: Chen, Y., Shen, C., Wei, X.S., Liu, L., Yang, J., 2017. Adversarial posenet: A structure-aware convolutional network for human pose estimation. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1212-1221.
– volume: 88
  start-page: 303
  year: 2010
  end-page: 338
  ident: b34
  article-title: The pascal visual object classes (voc) challenge
  publication-title: Int. J. Comput. Vis.
– start-page: 246
  year: 2016
  end-page: 260
  ident: b91
  article-title: Human pose estimation using deep consensus voting
  publication-title: Proc. European Conference on Computer Vision
– reference: Tekin, B., Márquez-Neila, P., Salzmann, M., Fua, P., 2017. Learning to fuse 2d and 3d image cues for monocular body pose estimation. In: Proc. IEEE International Conference on Computer Vision, pp. 3941–3950.
– start-page: 484
  year: 2018
  end-page: 494
  ident: b120
  article-title: Neural body fitting: Unifying deep learning and model based human pose and shape estimation
  publication-title: Proc. IEEE International Conference on 3D Vision
– reference: Rhodin, H., Spörri, I., Constantin, V., Meyer, F., Müller, E., Salzmann, M., Fua, P., 2018b. Learning monocular 3d human pose estimation from multi-view images. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 8437–8446.
– volume: 6
  start-page: 538
  year: 2012
  end-page: 552
  ident: b52
  article-title: Human pose estimation and activity recognition from multi-view videos: Comparative explorations of recent developments
  publication-title: IEEE J. Sel. Top. Signal Process.
– volume: 101
  start-page: 184
  year: 2013
  end-page: 204
  ident: b170
  article-title: Efficiently scaling up crowdsourced video annotation
  publication-title: Int. J. Comput. Vis.
– reference: Bogo, F., Romero, J., Pons-Moll, G., Black, M.J., 2017. Dynamic FAUST: Registering human bodies in motion. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 6233–6242.
– start-page: 627
  year: 2016
  end-page: 642
  ident: b60
  article-title: Multi-person pose estimation with local joint-to-person associations
  publication-title: Proc. European Conference on Computer Vision
– volume: 85
  start-page: 15
  year: 2019
  end-page: 22
  ident: b99
  article-title: Human pose regression by combining indirect part detection and contextual information
  publication-title: Comput. Graph.
– start-page: 1736
  year: 2014
  end-page: 1744
  ident: b22
  article-title: Articulated pose estimation by a graphical model with image dependent pairwise relations
  publication-title: Advances in Neural Information Processing Systems
– reference: Fabbri, M., Lanzi, F., Calderara, S., Palazzi, A., Vezzani, R., Cucchiara, R., 2018. Learning to detect and track visible and occluded body joints in a virtual world. In: Proc. European Conference on Computer Vision, pp. 430–446.
– year: 2019
  ident: b56
  article-title: INRIA4D
– volume: 81
  start-page: 231
  year: 2001
  end-page: 268
  ident: b109
  article-title: A survey of computer vision-based human motion capture
  publication-title: Comput. Vis. Image Underst.
– reference: Peng, X., Tang, Z., Yang, F., Feris, R.S., Metaxas, D., 2018. Jointly optimize data augmentation and network training: Adversarial data augmentation in human pose estimation. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 2226–2234.
– reference: Sapp, B., Weiss, D., Taskar, B., 2011. Parsing human motion with stretchable models. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1281–1288.
– reference: Ouyang, W., Chu, X., Wang, X., 2014. Multi-source deep learning for human pose estimation. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 2329–2336.
– reference: Toshev, A., Szegedy, C., 2014. Deeppose: Human pose estimation via deep neural networks. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1653–1660.
– reference: Dantone, M., Gall, J., Leistner, C., Van Gool, L., 2013. Human pose estimation using body parts dependent joint regressors. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 3041–3048.
– reference: Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T., 2011. HMDB: A large video database for human motion recognition. In: Proc. IEEE International Conference on Computer Vision, p. 6.
– start-page: 337
  year: 2009
  end-page: 346
  ident: b50
  article-title: A statistical model of human pose and body shape
  publication-title: Computer Graphics Forum
– start-page: 33
  year: 2014
  end-page: 47
  ident: b137
  article-title: Pose machines: Articulated pose estimation via inference machines
  publication-title: Proc. European Conference on Computer Vision
– reference: Ferrari, V., Marin-Jimenez, M., Zisserman, A., 2008. Progressive search space reduction for human pose estimation. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8.
– volume: 20
  start-page: 1246
  year: 2018
  end-page: 1259
  ident: b119
  article-title: Knowledge-guided deep fractal neural networks for human pose estimation
  publication-title: IEEE Trans. Multimed.
– start-page: 479
  year: 2016
  end-page: 488
  ident: b19
  article-title: Synthesizing training images for boosting human 3d pose estimation
  publication-title: Proc. IEEE International Conference on 3D Vision
– reference: Sapp, B., Taskar, B., 2013. Modec: Multimodal decomposable models for human pose estimation. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 3674–3681.
– volume: 43
  start-page: 1575
  year: 2011
  end-page: 1581
  ident: b2
  article-title: 2011 compendium of physical activities: a second update of codes and MET values
  publication-title: Med. Sci. Sports Exerc.
– reference: Li, S., Liu, Z.Q., Chan, A.B., 2014. Heterogeneous multi-task learning for human pose estimation with deep convolutional neural network. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 482–489.
– volume: 77
  start-page: 22901
  year: 2018
  end-page: 22921
  ident: b86
  article-title: 3d skeleton based action recognition by video-domain translation-scale invariant mapping and multi-scale dilated cnn
  publication-title: Multimedia Tools Appl.
– start-page: 1097
  year: 2012
  end-page: 1105
  ident: b78
  article-title: Imagenet classification with deep convolutional neural networks
  publication-title: Advances in Neural Information Processing Systems
– volume: 35
  start-page: 2878
  year: 2013
  end-page: 2890
  ident: b180
  article-title: Articulated human detection with flexible mixtures of parts
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
– reference: Tang, Z., Peng, X., Geng, S., Wu, L., Zhang, S., Metaxas, D., 2018b. Quantized densely connected u-nets for efficient landmark localization. In: Proc. European Conference on Computer Vision, pp. 339–354.
– reference: Huang, S., Gong, M., Tao, D., 2017. A coarse-fine network for keypoint localization. In: Proc. IEEE International Conference on Computer Vision, pp. 3028–3037.
– reference: Bogo, F., Romero, J., Loper, M., Black, M.J., 2014. FAUST: Dataset and evaluation for 3D mesh registration. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 3794–3801.
– reference: Gkioxari, G., Arbelaez, P., Bourdev, L., Malik, J., 2013. Articulated pose estimation using discriminative armlet classifiers. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 3342–3349.
– reference: Varol, G., Ceylan, D., Russell, B., Yang, J., Yumer, E., Laptev, I., Schmid, C., 2018. Bodynet: Volumetric inference of 3d human body shapes. In: Proc. European Conference on Computer Vision, pp. 20-36.
– start-page: 728
  year: 2016
  end-page: 743
  ident: b46
  article-title: Chained predictions using convolutional neural networks
  publication-title: Proc. European Conference on Computer Vision
– volume: 34
  start-page: 248
  year: 2015
  ident: b96
  article-title: Smpl: A skinned multi-person linear model
  publication-title: ACM Trans. Graph.
– volume: 32
  start-page: 10
  year: 2015
  end-page: 19
  ident: b94
  article-title: A survey of human pose estimation: the body parts parsing based methods
  publication-title: J. Vis. Commun. Image Represent.
– reference: Mehta, D., Sotnychenko, O., Mueller, F., Xu, W., Sridhar, S., Pons-Moll, G., Theobalt, C., 2018. Single-shot multi-person 3d body pose estimation from monocular rgb input. In: International Conference on 3D Vision, pp. 120-130.
– reference: Carreira, J., Agrawal, P., Fragkiadaki, K., Malik, J., 2016. Human pose estimation with iterative error feedback. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 4733–4742.
– reference: Rohrbach, M., Amin, S., Andriluka, M., Schiele, B., 2012. A database for fine grained activity detection of cooking activities. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1194–1201.
– volume: 34
  start-page: 1995
  year: 2013
  end-page: 2006
  ident: b21
  article-title: A survey of human motion analysis using depth imagery
  publication-title: Pattern Recognit. Lett.
– start-page: 483
  year: 2016
  end-page: 499
  ident: b115
  article-title: Stacked hourglass networks for human pose estimation
  publication-title: Proc. European Conference on Computer Vision
– volume: 73
  start-page: 428
  year: 1999
  end-page: 440
  ident: b1
  article-title: Human motion analysis: A review
  publication-title: Comput. Vis. Image Underst.
– volume: 14
  start-page: 4189
  year: 2014
  end-page: 4210
  ident: b128
  article-title: A survey on model based approaches for 2d and 3d visual human pose recovery
  publication-title: Sensors
– reference: Chen, Y., Wang, Z., Peng, Y., Zhang, Z., Yu, G., Sun, J., 2018. Cascaded pyramid network for multi-person pose estimation. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 7103–7112.
– reference: Pavlakos, G., Zhou, X., Derpanis, K.G., Daniilidis, K., 2017. Coarse-to-fine volumetric prediction for single-image 3d human pose. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1263–1272.
– reference: Chu, X., Yang, W., Ouyang, W., Ma, C., Yuille, A.L., Wang, X., 2017. Multi-context attention for human pose estimation. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1831-1840.
– reference: Yang, W., Ouyang, W., Wang, X., Ren, J., Li, H., Wang, X., 2018. 3d human pose estimation in the wild by adversarial learning. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 5255–5264.
– reference: Jhuang, H., Gall, J., Zuffi, S., Schmid, C., Black, M.J., 2013. Towards understanding action recognition. In: Proc. IEEE International Conference on Computer Vision, pp. 3192–3199.
– reference: Papandreou, G., Zhu, T., Kanazawa, N., Toshev, A., Tompson, J., Bregler, C., Murphy, K., 2017. Towards accurate multi-person pose estimation in the wild. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 4903–4911.
– reference: Martinez, J., Hossain, R., Romero, J., Little, J.J., 2017. A simple yet effective baseline for 3d human pose estimation. In: Proc. IEEE International Conference on Computer Vision, pp. 2640–2649.
– reference: Wei, S.E., Ramakrishna, V., Kanade, T., Sheikh, Y., 2016. Convolutional pose machines. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 4724–4732.
– volume: 104
  start-page: 90
  year: 2006
  end-page: 126
  ident: b110
  article-title: A survey of advances in vision-based human motion capture and analysis
  publication-title: Comput. Vis. Image Underst.
– year: 2014
  ident: b44
  article-title: R-cnns for pose estimation and action detection
– reference: Li, S., Zhang, W., Chan, A.B., 2015b. Maximum-margin structured learning with deep networks for 3d human pose estimation. In: Proc. IEEE International Conference on Computer Vision, pp. 2848–2856.
– reference: Sun, K., Xiao, B., Liu, D., Wang, J., 2019. Deep high-resolution representation learning for human pose estimation. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition.
– reference: Luo, Y., Ren, J., Wang, Z., Sun, W., Pan, J., Liu, J., Pang, J., Lin, L., 2018. Lstm pose machines. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 5207–5215.
– start-page: 538
  year: 2014
  end-page: 552
  ident: b130
  article-title: Deep convolutional neural networks for efficient pose estimation in gesture videos
  publication-title: Proc. Asian Conference on Computer Vision
– volume: 110
  start-page: 70
  year: 2014
  end-page: 90
  ident: b15
  article-title: Automatic and efficient human pose estimation for sign language videos
  publication-title: Int. J. Comput. Vis.
– start-page: 408
  year: 2005
  end-page: 416
  ident: b5
  article-title: Scape: shape completion and animation of people
  publication-title: ACM Transactions on Graphics
– year: 2019
  ident: b75
  article-title: Kinect
– reference: Andriluka, M., Iqbal, U., Milan, A., Insafutdinov, E., Pishchulin, L., Gall, J., Schiele, B., 2018. Posetrack: A benchmark for human pose estimation and tracking. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 5167–5176.
– start-page: 506
  year: 2017
  end-page: 516
  ident: b104
  article-title: Monocular 3d human pose estimation in the wild using improved cnn supervision
  publication-title: Proc. IEEE International Conference on 3D Vision
– start-page: 241
  year: 2001
  end-page: 244
  ident: b108
  article-title: Motion Capture File Formats Explained, Vol. 211
– volume: 108
  start-page: 4
  year: 2007
  end-page: 18
  ident: b134
  article-title: Vision-based human motion analysis: An overview
  publication-title: Comput. Vis. Image Underst.
– volume: 34
  start-page: 334
  year: 2004
  end-page: 352
  ident: b54
  article-title: A survey on visual surveillance of object motion and behaviors
  publication-title: IEEE Trans. Syst. Man Cybern. Part C
– year: 2019
  ident: b161
  article-title: TheCaptury
– reference: Sun, X., Xiao, B., Wei, F., Liang, S., Wei, Y., 2018. Integral human pose regression. In: Proc. European Conference on Computer Vision, pp. 529–545.
– reference: Li, L., Fei-fei, L., 2007. What, where and who? classifying events by scene and object recognition. In: Proc. IEEE International Conference on Computer Vision, p. 6.
– reference: Zhou, X., Huang, Q., Sun, X., Xue, X., Wei, Y., 2017. Towards 3d human pose estimation in the wild: a weakly-supervised approach. In: Proc. IEEE International Conference on Computer Vision, pp. 398–407.
– ident: 10.1016/j.cviu.2019.102897_b13
  doi: 10.1109/CVPR.2017.143
– ident: 10.1016/j.cviu.2019.102897_b57
  doi: 10.1109/CVPR.2017.142
– start-page: 2277
  year: 2017
  ident: 10.1016/j.cviu.2019.102897_b114
  article-title: Associative embedding: End-to-end learning for joint detection and grouping
– volume: 41
  start-page: 190
  year: 2017
  ident: 10.1016/j.cviu.2019.102897_b70
  article-title: Panoptic studio: A massively multiview system for social interaction capture
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
  doi: 10.1109/TPAMI.2017.2782743
– volume: 38
  start-page: 1533
  year: 2016
  ident: 10.1016/j.cviu.2019.102897_b102
  article-title: Human pose estimation from video and imus
  publication-title: IEEE transactions on pattern analysis and machine intelligence
  doi: 10.1109/TPAMI.2016.2522398
– volume: 77
  start-page: 22901
  year: 2018
  ident: 10.1016/j.cviu.2019.102897_b86
  article-title: 3d skeleton based action recognition by video-domain translation-scale invariant mapping and multi-scale dilated cnn
  publication-title: Multimedia Tools Appl.
  doi: 10.1007/s11042-018-5642-0
– ident: 10.1016/j.cviu.2019.102897_b135
– ident: 10.1016/j.cviu.2019.102897_b126
  doi: 10.1109/CVPR.2018.00055
– start-page: 1736
  year: 2014
  ident: 10.1016/j.cviu.2019.102897_b22
  article-title: Articulated pose estimation by a graphical model with image dependent pairwise relations
– volume: 34
  start-page: 1995
  year: 2013
  ident: 10.1016/j.cviu.2019.102897_b21
  article-title: A survey of human motion analysis using depth imagery
  publication-title: Pattern Recognit. Lett.
  doi: 10.1016/j.patrec.2013.02.006
– ident: 10.1016/j.cviu.2019.102897_b129
  doi: 10.1109/ICCV.2015.222
– ident: 10.1016/j.cviu.2019.102897_b122
  doi: 10.1007/978-3-030-01264-9_17
– start-page: 246
  year: 2016
  ident: 10.1016/j.cviu.2019.102897_b91
  article-title: Human pose estimation using deep consensus voting
– ident: 10.1016/j.cviu.2019.102897_b92
  doi: 10.1109/CVPR.2017.106
– ident: 10.1016/j.cviu.2019.102897_b133
  doi: 10.1109/CVPR.2017.501
– volume: 104
  start-page: 90
  year: 2006
  ident: 10.1016/j.cviu.2019.102897_b110
  article-title: A survey of advances in vision-based human motion capture and analysis
  publication-title: Comput. Vis. Image Underst.
  doi: 10.1016/j.cviu.2006.08.002
– ident: 10.1016/j.cviu.2019.102897_b45
  doi: 10.1109/CVPR.2014.458
– ident: 10.1016/j.cviu.2019.102897_b101
  doi: 10.1007/978-3-030-01249-6_37
– start-page: 483
  year: 2016
  ident: 10.1016/j.cviu.2019.102897_b115
  article-title: Stacked hourglass networks for human pose estimation
– volume: 20
  start-page: 1246
  year: 2018
  ident: 10.1016/j.cviu.2019.102897_b119
  article-title: Knowledge-guided deep fractal neural networks for human pose estimation
  publication-title: IEEE Trans. Multimed.
  doi: 10.1109/TMM.2017.2762010
– ident: 10.1016/j.cviu.2019.102897_b187
  doi: 10.1109/CVPR.2012.6248098
– volume: 85
  start-page: 15
  year: 2019
  ident: 10.1016/j.cviu.2019.102897_b99
  article-title: Human pose regression by combining indirect part detection and contextual information
  publication-title: Comput. Graph.
  doi: 10.1016/j.cag.2019.09.002
– ident: 10.1016/j.cviu.2019.102897_b178
  doi: 10.1109/CVPR.2016.335
– ident: 10.1016/j.cviu.2019.102897_b73
  doi: 10.1109/CVPR.2018.00744
– ident: 10.1016/j.cviu.2019.102897_b177
  doi: 10.1109/ICCV.2017.144
– volume: 81
  start-page: 231
  year: 2001
  ident: 10.1016/j.cviu.2019.102897_b109
  article-title: A survey of computer vision-based human motion capture
  publication-title: Comput. Vis. Image Underst.
  doi: 10.1006/cviu.2000.0897
– ident: 10.1016/j.cviu.2019.102897_b18
  doi: 10.1109/ICCV.2017.137
– start-page: 185
  year: 2008
  ident: 10.1016/j.cviu.2019.102897_b150
  article-title: 3d human motion analysis in monocular video: techniques and challenges
– volume: 34
  start-page: 334
  year: 2004
  ident: 10.1016/j.cviu.2019.102897_b54
  article-title: A survey on visual surveillance of object motion and behaviors
  publication-title: IEEE Trans. Syst. Man Cybern. Part C
  doi: 10.1109/TSMCC.2004.829274
– ident: 10.1016/j.cviu.2019.102897_b124
  doi: 10.1109/CVPR.2018.00763
– start-page: 91
  year: 2015
  ident: 10.1016/j.cviu.2019.102897_b138
  article-title: Faster r-cnn: Towards real-time object detection with region proposal networks
– ident: 10.1016/j.cviu.2019.102897_b32
  doi: 10.5244/C.23.3
– ident: 10.1016/j.cviu.2019.102897_b20
  doi: 10.1109/CVPR.2018.00742
– volume: 40
  start-page: 13
  year: 2010
  ident: 10.1016/j.cviu.2019.102897_b67
  article-title: Advances in view-invariant human motion analysis: A review
  publication-title: IEEE Trans. Syst. Man Cybern. Part C
  doi: 10.1109/TSMCC.2009.2027608
– ident: 10.1016/j.cviu.2019.102897_b121
  doi: 10.1109/CVPR.2014.299
– ident: 10.1016/j.cviu.2019.102897_b79
  doi: 10.1109/CVPR.2017.500
– ident: 10.1016/j.cviu.2019.102897_b9
  doi: 10.1109/CVPR.2014.491
– ident: 10.1016/j.cviu.2019.102897_b16
  doi: 10.1109/CVPR.2016.334
– ident: 10.1016/j.cviu.2019.102897_b68
  doi: 10.5244/C.24.12
– year: 2014
  ident: 10.1016/j.cviu.2019.102897_b44
– ident: 10.1016/j.cviu.2019.102897_b163
  doi: 10.1109/CVPR.2015.7298664
– ident: 10.1016/j.cviu.2019.102897_b82
– volume: 6
  start-page: 538
  year: 2012
  ident: 10.1016/j.cviu.2019.102897_b52
  article-title: Human pose estimation and activity recognition from multi-view videos: Comparative explorations of recent developments
  publication-title: IEEE J. Sel. Top. Signal Process.
  doi: 10.1109/JSTSP.2012.2196975
– ident: 10.1016/j.cviu.2019.102897_b153
  doi: 10.1109/ICCV.2017.284
– year: 2011
  ident: 10.1016/j.cviu.2019.102897_b111
– year: 2019
  ident: 10.1016/j.cviu.2019.102897_b75
– start-page: 34
  year: 2016
  ident: 10.1016/j.cviu.2019.102897_b58
  article-title: Deepercut: A deeper, stronger, and faster multi-person pose estimation model
– volume: 73
  start-page: 82
  year: 1999
  ident: 10.1016/j.cviu.2019.102897_b42
  article-title: The visual analysis of human movement: A survey
  publication-title: Comput. Vis. Image Underst.
  doi: 10.1006/cviu.1998.0716
– ident: 10.1016/j.cviu.2019.102897_b131
  doi: 10.1109/CVPR.2016.533
– ident: 10.1016/j.cviu.2019.102897_b144
  doi: 10.1109/CVPR.2011.5995607
– ident: 10.1016/j.cviu.2019.102897_b90
  doi: 10.1109/ICCV.2015.326
– ident: 10.1016/j.cviu.2019.102897_b186
  doi: 10.1109/CVPR.2015.7298976
– start-page: 907
  year: 2014
  ident: 10.1016/j.cviu.2019.102897_b36
  article-title: A monocular pose estimation system based on infrared leds
– year: 2019
  ident: 10.1016/j.cviu.2019.102897_b169
– ident: 10.1016/j.cviu.2019.102897_b85
  doi: 10.1109/ICCV.2007.4408872
– year: 2018
  ident: 10.1016/j.cviu.2019.102897_b171
– year: 2019
  ident: 10.1016/j.cviu.2019.102897_b105
– ident: 10.1016/j.cviu.2019.102897_b176
  doi: 10.1007/978-3-030-01231-1_29
– start-page: 228
  year: 2010
  ident: 10.1016/j.cviu.2019.102897_b29
  article-title: We are family: Joint pose estimation of multiple persons
– year: 2012
  ident: 10.1016/j.cviu.2019.102897_b30
– volume: 61
  start-page: 38
  year: 1995
  ident: 10.1016/j.cviu.2019.102897_b26
  article-title: Active shape models-their training and application
  publication-title: Comput. Vis. Image Underst.
  doi: 10.1006/cviu.1995.1004
– ident: 10.1016/j.cviu.2019.102897_b38
  doi: 10.1109/ICCV.2017.256
– ident: 10.1016/j.cviu.2019.102897_b136
  doi: 10.5244/C.30.109
– ident: 10.1016/j.cviu.2019.102897_b179
  doi: 10.1109/CVPR.2018.00551
– ident: 10.1016/j.cviu.2019.102897_b182
  doi: 10.1109/ICCV.2013.280
– ident: 10.1016/j.cviu.2019.102897_b84
  doi: 10.1109/CVPR.2019.00465
– ident: 10.1016/j.cviu.2019.102897_b17
  doi: 10.1109/CVPR.2017.610
– start-page: 479
  year: 2016
  ident: 10.1016/j.cviu.2019.102897_b19
  article-title: Synthesizing training images for boosting human 3d pose estimation
– ident: 10.1016/j.cviu.2019.102897_b125
  doi: 10.1109/CVPR.2017.139
– ident: 10.1016/j.cviu.2019.102897_b113
  doi: 10.1109/CVPR.2017.170
– start-page: 2017
  year: 2015
  ident: 10.1016/j.cviu.2019.102897_b62
  article-title: Spatial transformer networks
– year: 2019
  ident: 10.1016/j.cviu.2019.102897_b56
– ident: 10.1016/j.cviu.2019.102897_b139
  doi: 10.1007/978-3-030-01249-6_46
– ident: 10.1016/j.cviu.2019.102897_b65
  doi: 10.1109/ICCV.2013.396
– start-page: 1799
  year: 2014
  ident: 10.1016/j.cviu.2019.102897_b164
  article-title: Joint training of a convolutional network and a graphical model for human pose estimation
– ident: 10.1016/j.cviu.2019.102897_b181
  doi: 10.1109/CVPR.2018.00229
– ident: 10.1016/j.cviu.2019.102897_b183
  doi: 10.1109/CVPR.2018.00768
– start-page: 332
  year: 2014
  ident: 10.1016/j.cviu.2019.102897_b80
  article-title: 3d human pose estimation from monocular images with deep convolutional neural network
– ident: 10.1016/j.cviu.2019.102897_b184
  doi: 10.1109/ICCV.2017.51
– volume: 36
  start-page: 44
  year: 2017
  ident: 10.1016/j.cviu.2019.102897_b107
  article-title: Vnect: Real-time 3d human pose estimation with a single rgb camera
  publication-title: ACM Trans. Graph.
  doi: 10.1145/3072959.3073596
– ident: 10.1016/j.cviu.2019.102897_b10
  doi: 10.1109/CVPR.2017.591
– volume: 83
  start-page: 328
  year: 2018
  ident: 10.1016/j.cviu.2019.102897_b83
  article-title: Monocular depth estimation with hierarchical fusion of dilated cnns and soft-weighted-sum inference
  publication-title: Pattern Recognit.
  doi: 10.1016/j.patcog.2018.05.029
– ident: 10.1016/j.cviu.2019.102897_b173
  doi: 10.1109/CVPR.2011.5995519
– volume: 36
  start-page: 1325
  year: 2014
  ident: 10.1016/j.cviu.2019.102897_b59
  article-title: Human3.6m: Large scale datasets and predictive methods for 3d human sensing in natural environments
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
  doi: 10.1109/TPAMI.2013.248
– ident: 10.1016/j.cviu.2019.102897_b89
– volume: 101
  start-page: 184
  year: 2013
  ident: 10.1016/j.cviu.2019.102897_b170
  article-title: Efficiently scaling up crowdsourced video annotation
  publication-title: Int. J. Comput. Vis.
  doi: 10.1007/s11263-012-0564-1
– ident: 10.1016/j.cviu.2019.102897_b162
  doi: 10.1109/CVPR.2017.603
– ident: 10.1016/j.cviu.2019.102897_b28
  doi: 10.1109/AVSS.2018.8639378
– start-page: 506
  year: 2017
  ident: 10.1016/j.cviu.2019.102897_b104
  article-title: Monocular 3d human pose estimation in the wild using improved cnn supervision
– ident: 10.1016/j.cviu.2019.102897_b148
– ident: 10.1016/j.cviu.2019.102897_b143
  doi: 10.1109/CVPR.2013.471
– ident: 10.1016/j.cviu.2019.102897_b98
  doi: 10.1109/CVPR.2018.00539
– ident: 10.1016/j.cviu.2019.102897_b112
  doi: 10.1109/CVPR.2019.00796
– ident: 10.1016/j.cviu.2019.102897_b146
  doi: 10.1109/CVPR.2016.115
– ident: 10.1016/j.cviu.2019.102897_b37
– volume: 14
  start-page: 4189
  year: 2014
  ident: 10.1016/j.cviu.2019.102897_b128
  article-title: A survey on model based approaches for 2d and 3d visual human pose recovery
  publication-title: Sensors
  doi: 10.3390/s140304189
– start-page: 538
  year: 2014
  ident: 10.1016/j.cviu.2019.102897_b130
  article-title: Deep convolutional neural networks for efficient pose estimation in gesture videos
– volume: 39
  start-page: 501
  year: 2017
  ident: 10.1016/j.cviu.2019.102897_b33
  article-title: Marconi—convnet-based marker-less motion capture in outdoor and indoor scenes
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
  doi: 10.1109/TPAMI.2016.2557779
– start-page: 337
  year: 2009
  ident: 10.1016/j.cviu.2019.102897_b50
  article-title: A statistical model of human pose and body shape
– year: 2017
  ident: 10.1016/j.cviu.2019.102897_b53
– ident: 10.1016/j.cviu.2019.102897_b72
– ident: 10.1016/j.cviu.2019.102897_b168
  doi: 10.1109/CVPR.2017.492
– start-page: 728
  year: 2016
  ident: 10.1016/j.cviu.2019.102897_b46
  article-title: Chained predictions using convolutional neural networks
– start-page: 437
  year: 2018
  ident: 10.1016/j.cviu.2019.102897_b76
  article-title: Multiposenet: Fast multi-person pose estimation using pose residual network
– ident: 10.1016/j.cviu.2019.102897_b88
  doi: 10.1109/CVPRW.2014.78
– year: 2018
  ident: 10.1016/j.cviu.2019.102897_b116
– ident: 10.1016/j.cviu.2019.102897_b6
  doi: 10.1109/CVPR.2019.00351
– ident: 10.1016/j.cviu.2019.102897_b41
  doi: 10.1109/CVPR.2008.4587468
– ident: 10.1016/j.cviu.2019.102897_b155
  doi: 10.5244/C.31.15
– start-page: 627
  year: 2016
  ident: 10.1016/j.cviu.2019.102897_b60
  article-title: Multi-person pose estimation with local joint-to-person associations
– volume: 34
  start-page: 248
  year: 2015
  ident: 10.1016/j.cviu.2019.102897_b96
  article-title: Smpl: A skinned multi-person linear model
  publication-title: ACM Trans. Graph.
  doi: 10.1145/2816795.2818013
– volume: 43
  start-page: 1575
  year: 2011
  ident: 10.1016/j.cviu.2019.102897_b2
  article-title: 2011 compendium of physical activities: a second update of codes and met values
  publication-title: Med. Sci. Sports Exerc.
  doi: 10.1249/MSS.0b013e31821ece12
– year: 2019
  ident: 10.1016/j.cviu.2019.102897_b161
– ident: 10.1016/j.cviu.2019.102897_b55
  doi: 10.1109/ICCV.2017.329
– ident: 10.1016/j.cviu.2019.102897_b66
– ident: 10.1016/j.cviu.2019.102897_b81
– start-page: 740
  year: 2014
  ident: 10.1016/j.cviu.2019.102897_b93
  article-title: Microsoft coco: Common objects in context
– volume: 32
  start-page: 10
  year: 2015
  ident: 10.1016/j.cviu.2019.102897_b94
  article-title: A survey of human pose estimation: the body parts parsing based methods
  publication-title: J. Vis. Commun. Image Represent.
  doi: 10.1016/j.jvcir.2015.06.013
– ident: 10.1016/j.cviu.2019.102897_b40
– start-page: 186
  year: 2016
  ident: 10.1016/j.cviu.2019.102897_b185
  article-title: Deep kinematic pose regression
– start-page: 468
  year: 2017
  ident: 10.1016/j.cviu.2019.102897_b7
  article-title: Recurrent human pose estimation
– ident: 10.1016/j.cviu.2019.102897_b71
  doi: 10.1109/CVPR.2018.00868
– ident: 10.1016/j.cviu.2019.102897_b158
  doi: 10.1007/978-3-030-01219-9_12
– ident: 10.1016/j.cviu.2019.102897_b151
  doi: 10.1109/ICCV.2017.284
– ident: 10.1016/j.cviu.2019.102897_b118
  doi: 10.1109/ICCV.2017.373
– ident: 10.1016/j.cviu.2019.102897_b167
  doi: 10.1007/978-3-030-01234-2_2
– year: 2017
  ident: 10.1016/j.cviu.2019.102897_b175
– volume: 16
  start-page: 1966
  year: 2016
  ident: 10.1016/j.cviu.2019.102897_b47
  article-title: Human pose estimation from monocular images: A comprehensive survey
  publication-title: Sensors
  doi: 10.3390/s16121966
– ident: 10.1016/j.cviu.2019.102897_b23
  doi: 10.23919/APSIPA.2018.8659538
– ident: 10.1016/j.cviu.2019.102897_b160
  doi: 10.1109/ICCV.2017.425
– volume: 61
  start-page: 55
  year: 2005
  ident: 10.1016/j.cviu.2019.102897_b39
  article-title: Pictorial structures for object recognition
  publication-title: Int. J. Compu. Vis.
  doi: 10.1023/B:VISI.0000042934.15159.49
– ident: 10.1016/j.cviu.2019.102897_b127
  doi: 10.1109/CVPR.2018.00237
– start-page: 408
  year: 2005
  ident: 10.1016/j.cviu.2019.102897_b5
  article-title: Scape: shape completion and animation of people
– ident: 10.1016/j.cviu.2019.102897_b117
  doi: 10.1007/978-3-030-01228-1_42
– ident: 10.1016/j.cviu.2019.102897_b103
  doi: 10.1109/ICCV.2017.288
– start-page: 1365
  year: 2009
  ident: 10.1016/j.cviu.2019.102897_b11
  article-title: Poselets: Body part detectors trained using 3d human pose annotations
– ident: 10.1016/j.cviu.2019.102897_b3
  doi: 10.1109/CVPR.2018.00542
– volume: 88
  start-page: 303
  year: 2010
  ident: 10.1016/j.cviu.2019.102897_b34
  article-title: The pascal visual object classes (voc) challenge
  publication-title: Int. J. Comput. Vis.
  doi: 10.1007/s11263-009-0275-4
– ident: 10.1016/j.cviu.2019.102897_b49
  doi: 10.1109/CVPR.2018.00762
– ident: 10.1016/j.cviu.2019.102897_b69
  doi: 10.1109/CVPR.2011.5995318
– ident: 10.1016/j.cviu.2019.102897_b74
  doi: 10.1109/ICIP.2018.8451114
– ident: 10.1016/j.cviu.2019.102897_b95
  doi: 10.1109/CVPR.2015.7298965
– start-page: 561
  year: 2016
  ident: 10.1016/j.cviu.2019.102897_b8
  article-title: Keep it smpl: Automatic estimation of 3d human pose and shape from a single image
– volume: 108
  start-page: 4
  year: 2007
  ident: 10.1016/j.cviu.2019.102897_b134
  article-title: Vision-based human motion analysis: An overview
  publication-title: Comput. Vis. Image Underst.
  doi: 10.1016/j.cviu.2006.10.016
– ident: 10.1016/j.cviu.2019.102897_b166
  doi: 10.5244/C.31.14
– ident: 10.1016/j.cviu.2019.102897_b27
  doi: 10.1109/CVPR.2013.391
– volume: 35
  start-page: 2821
  year: 2012
  ident: 10.1016/j.cviu.2019.102897_b147
  article-title: Efficient human pose estimation from single depth images
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
  doi: 10.1109/TPAMI.2012.241
– ident: 10.1016/j.cviu.2019.102897_b142
  doi: 10.1109/CVPR.2012.6247801
– volume: 34
  start-page: 120
  year: 2015
  ident: 10.1016/j.cviu.2019.102897_b132
  article-title: Dyna: A model of dynamic human shape in motion
  publication-title: ACM Trans. Graph.
  doi: 10.1145/2766993
– start-page: 33
  year: 2014
  ident: 10.1016/j.cviu.2019.102897_b137
  article-title: Pose machines: Articulated pose estimation via inference machines
– ident: 10.1016/j.cviu.2019.102897_b157
  doi: 10.1109/CVPR.2019.00120
– ident: 10.1016/j.cviu.2019.102897_b154
  doi: 10.1109/CVPR.2016.308
– ident: 10.1016/j.cviu.2019.102897_b87
  doi: 10.1109/CVPR.2019.01012
– start-page: 484
  year: 2018
  ident: 10.1016/j.cviu.2019.102897_b120
  article-title: Neural body fitting: Unifying deep learning and model based human pose and shape estimation
– ident: 10.1016/j.cviu.2019.102897_b174
  doi: 10.1109/CVPR.2016.511
– volume: 87
  start-page: 4
  year: 2010
  ident: 10.1016/j.cviu.2019.102897_b149
  article-title: Humaneva: Synchronized video and motion capture dataset and baseline algorithm for evaluation of articulated human motion
  publication-title: Int. J. Comput. Vis.
  doi: 10.1007/s11263-009-0273-6
– volume: 171
  start-page: 118
  year: 2018
  ident: 10.1016/j.cviu.2019.102897_b172
  article-title: Rgb-d-based human motion recognition with deep learning: A survey
  publication-title: Comput. Vis. Image Underst.
  doi: 10.1016/j.cviu.2018.04.007
– start-page: 2980
  year: 2017
  ident: 10.1016/j.cviu.2019.102897_b51
  article-title: Mask r-cnn
– ident: 10.1016/j.cviu.2019.102897_b4
  doi: 10.1109/CVPR.2014.471
– year: 2019
  ident: 10.1016/j.cviu.2019.102897_b100
– start-page: 717
  year: 2016
  ident: 10.1016/j.cviu.2019.102897_b12
  article-title: Human pose estimation via convolutional part heatmap regression
– ident: 10.1016/j.cviu.2019.102897_b140
  doi: 10.1109/CVPR.2018.00880
– volume: 34
  start-page: 2282
  year: 2012
  ident: 10.1016/j.cviu.2019.102897_b31
  article-title: Human pose co-estimation and applications
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
  doi: 10.1109/TPAMI.2012.85
– ident: 10.1016/j.cviu.2019.102897_b141
  doi: 10.1109/CVPR.2017.134
– start-page: 302
  year: 2014
  ident: 10.1016/j.cviu.2019.102897_b64
  article-title: Modeep: A deep learning framework using motion features for human pose estimation
– ident: 10.1016/j.cviu.2019.102897_b152
  doi: 10.1109/CVPR.2019.00584
– ident: 10.1016/j.cviu.2019.102897_b77
  doi: 10.1109/CVPR.2019.01225
– volume: 73
  start-page: 428
  year: 1999
  ident: 10.1016/j.cviu.2019.102897_b1
  article-title: Human motion analysis: A review
  publication-title: Comput. Vis. Image Underst.
  doi: 10.1006/cviu.1998.0744
– ident: 10.1016/j.cviu.2019.102897_b14
  doi: 10.1109/CVPR.2016.512
– volume: 40
  start-page: 33
  year: 1975
  ident: 10.1016/j.cviu.2019.102897_b48
  article-title: Generalized procrustes analysis
  publication-title: Psychometrika
  doi: 10.1007/BF02291478
– start-page: 1097
  year: 2012
  ident: 10.1016/j.cviu.2019.102897_b78
  article-title: Imagenet classification with deep convolutional neural networks
– year: 2013
  ident: 10.1016/j.cviu.2019.102897_b63
– volume: 152
  start-page: 1
  year: 2016
  ident: 10.1016/j.cviu.2019.102897_b145
  article-title: 3d human pose estimation: A review of the literature and analysis of covariates
  publication-title: Comput. Vis. Image Underst.
  doi: 10.1016/j.cviu.2016.09.002
– ident: 10.1016/j.cviu.2019.102897_b123
  doi: 10.1109/CVPR.2017.395
– ident: 10.1016/j.cviu.2019.102897_b156
  doi: 10.1007/978-3-030-01219-9_21
– ident: 10.1016/j.cviu.2019.102897_b24
  doi: 10.1109/CVPR.2016.510
– ident: 10.1016/j.cviu.2019.102897_b25
  doi: 10.1109/CVPR.2017.601
– ident: 10.1016/j.cviu.2019.102897_b97
  doi: 10.1109/CVPR.2018.00546
– volume: 110
  start-page: 70
  year: 2014
  ident: 10.1016/j.cviu.2019.102897_b15
  article-title: Automatic and efficient human pose estimation for sign language videos
  publication-title: Int. J. Comput. Vis.
  doi: 10.1007/s11263-013-0672-6
– ident: 10.1016/j.cviu.2019.102897_b43
  doi: 10.1109/CVPR.2013.429
– ident: 10.1016/j.cviu.2019.102897_b61
  doi: 10.1109/CVPR.2017.495
– ident: 10.1016/j.cviu.2019.102897_b35
  doi: 10.1007/978-3-030-01225-0_27
– ident: 10.1016/j.cviu.2019.102897_b106
  doi: 10.1109/3DV.2018.00024
– ident: 10.1016/j.cviu.2019.102897_b165
  doi: 10.1109/CVPR.2014.214
– volume: 35
  start-page: 2878
  year: 2013
  ident: 10.1016/j.cviu.2019.102897_b180
  article-title: Articulated human detection with flexible mixtures of parts
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
  doi: 10.1109/TPAMI.2012.261
– start-page: 241
  year: 2001
  ident: 10.1016/j.cviu.2019.102897_b108
– year: 2016
  ident: 10.1016/j.cviu.2019.102897_b159
StartPage 102897
SubjectTerms Deep learning
Human pose estimation
Survey
Title Monocular human pose estimation: A survey of deep learning-based methods
URI https://dx.doi.org/10.1016/j.cviu.2019.102897
Volume 192