EHPE: Skeleton Cues-Based Gaussian Coordinate Encoding for Efficient Human Pose Estimation
| Published in | IEEE Transactions on Multimedia, Vol. 26, pp. 8464-8475 |
|---|---|
| Main Authors | Liu, Hai; Liu, Tingting; Chen, Yu; Zhang, Zhaoli; Li, You-Fu |
| Format | Journal Article |
| Language | English |
| Published | IEEE, 01.01.2024 |
Abstract | Human pose estimation (HPE) has a wide range of applications, such as multimedia processing, behavior understanding, and human-computer interaction. Most previous studies are subject to constraints such as restricted scenarios and RGB inputs. To relax these constraints and estimate human poses in general scenarios, we present an efficient human pose estimation model (EHPE) with joint direction cues and Gaussian coordinate encoding. Specifically, we propose an anisotropic Gaussian coordinate coding method to describe the skeleton direction cues among adjacent keypoints. To the best of our knowledge, this is the first time that skeleton direction cues are introduced into heatmap encoding for the HPE task. Then, a multi-loss function is proposed to constrain the output and prevent overfitting; the Kullback-Leibler divergence is introduced to measure the discrepancy between the predicted label and its ground truth. The performance of EHPE is evaluated on two HPE datasets: MS COCO and MPII. Experimental results demonstrate that EHPE obtains robust results and significantly outperforms existing state-of-the-art HPE methods. Lastly, we extend the experiments to infrared images captured by our research group, achieving impressive results despite the insufficient color and texture information. |
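The two ideas highlighted in the abstract, an anisotropic Gaussian heatmap target elongated along the skeleton (joint-to-adjacent-joint) direction and a Kullback-Leibler divergence loss between predicted and ground-truth heatmaps, can be sketched as follows. This is a minimal illustration only: the function and parameter names (`sigma_major`, `sigma_minor`, the toy keypoints) are assumptions, not the authors' implementation.

```python
# Sketch of an anisotropic Gaussian heatmap oriented along the limb direction,
# plus a KL-divergence loss between normalized heatmaps. Parameter names and
# values are illustrative assumptions, not the paper's exact formulation.
import numpy as np

def anisotropic_gaussian_heatmap(size, keypoint, neighbor,
                                 sigma_major=3.0, sigma_minor=1.5):
    """Build an H x W heatmap whose Gaussian is stretched along the
    direction from `keypoint` to the adjacent skeleton joint `neighbor`."""
    h, w = size
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
    dx, dy = xs - keypoint[0], ys - keypoint[1]

    # Unit vector of the skeleton (limb) direction.
    vx, vy = neighbor[0] - keypoint[0], neighbor[1] - keypoint[1]
    norm = max(np.hypot(vx, vy), 1e-6)
    vx, vy = vx / norm, vy / norm

    # Project pixel offsets onto the limb axis (major) and its normal (minor).
    along = dx * vx + dy * vy
    across = -dx * vy + dy * vx
    return np.exp(-(along ** 2 / (2 * sigma_major ** 2)
                    + across ** 2 / (2 * sigma_minor ** 2)))

def kl_heatmap_loss(pred, target, eps=1e-8):
    """KL divergence between the normalized target and predicted heatmaps."""
    p = target / (target.sum() + eps)
    q = pred / (pred.sum() + eps)
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

# Toy usage: an elbow keypoint whose Gaussian is elongated toward the wrist.
target = anisotropic_gaussian_heatmap((64, 48), keypoint=(20, 30), neighbor=(35, 40))
pred = anisotropic_gaussian_heatmap((64, 48), keypoint=(21, 31), neighbor=(35, 40))
print(kl_heatmap_loss(pred, target))
```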
Author | Zhang, Zhaoli; Chen, Yu; Liu, Tingting; Li, You-Fu; Liu, Hai
Author_xml | 1. Liu, Hai (ORCID 0000-0003-3446-9301, hailiu0204@ccnu.edu.cn); 2. Liu, Tingting (ORCID 0000-0002-9347-5974, tliu@hubu.edu.cn); 3. Chen, Yu (ORCID 0000-0003-1824-5105, cxxx912@mails.ccnu.edu.cn); 4. Zhang, Zhaoli (ORCID 0000-0002-0844-0719, zl.zhang@ccnu.edu.cn); 5. Li, You-Fu (ORCID 0000-0002-5227-1326, meyfli@cityu.edu.hk)
CODEN | ITMUF8 |
ContentType | Journal Article |
DOI | 10.1109/TMM.2022.3197364 |
Discipline | Engineering; Computer Science
EISSN | 1941-0077 |
EndPage | 8475 |
ExternalDocumentID | 10_1109_TMM_2022_3197364 9852299 |
Genre | orig-research |
GrantInformation | National Key R&D Program of China (2021YFC3340802); University Teaching Reform Research Project of Jiangxi Province (JXJG-23-27-6); National Natural Science Foundation of Hubei Province (2022CFB971); Jiangxi Provincial Natural Science Foundation (20232BAB212026); Research Grants Council of Hong Kong (9043323, CityU 11213420); Science and Technology Development Fund, Macau (0022/2019/AKP); Shenzhen Science and Technology Program (JCYJ20230807152900001); National Natural Science Foundation of China (62277041, 62077020, 62173286, 62211530433, 62177018; funder ID 10.13039/501100001809)
ISSN | 1520-9210 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Language | English |
License | https://creativecommons.org/licenses/by/4.0/legalcode |
ORCID | 0000-0002-5227-1326 0000-0003-3446-9301 0000-0002-9347-5974 0000-0003-1824-5105 0000-0002-0844-0719 |
OpenAccessLink | https://ieeexplore.ieee.org/document/9852299
PageCount | 12 |
PublicationDate | 2024-01-01
PublicationTitle | IEEE transactions on multimedia |
PublicationTitleAbbrev | TMM |
PublicationYear | 2024 |
Publisher | IEEE |
StartPage | 8464 |
SubjectTerms | Biological system modeling; Deep learning; Encoding; Feature extraction; gaussian coordinate encoding; Heating systems; human pose estimation; Pose estimation; regularization; Skeleton; skeleton direction; Task analysis
Title | EHPE: Skeleton Cues-Based Gaussian Coordinate Encoding for Efficient Human Pose Estimation |
URI | https://ieeexplore.ieee.org/document/9852299 |
Volume | 26 |