Attention‐aware spatio‐temporal learning for multi‐view gait‐based age estimation and gender classification
Published in | IET computer vision Vol. 19; no. 1 |
---|---|
Main Authors | Huang, Binyuan; Luo, Yongdong; Xie, Jiahui; Pan, Jiahui; Zhou, Chengju |
Format | Journal Article |
Language | English |
Published | 01.01.2025 |
Abstract | Recently, gait‐based age and gender recognition have attracted considerable attention in the fields of advertisement marketing and surveillance retrieval, owing to the unique advantage that gait can be perceived from a long distance. Intuitively, age and gender can be recognised by observing people's static shape (e.g. different hairstyles between males and females) and dynamic motion (e.g. different walking velocities between the elderly and the young). However, most existing gait‐based age and gender recognition methods rely on the Gait Energy Image (GEI), which cannot explicitly model temporal dynamics and is not robust to the multi‐view conditions that inevitably arise in real applications. Therefore, in this study, an Attention‐aware Spatio‐Temporal Learning (ASTL) framework is proposed, which takes a silhouette sequence as input to learn essential and invariant spatio‐temporal gait representations. More specifically, a Multi‐Scale Temporal Aggregation (MSTA) module provides an effective scheme for describing gait dynamics by exploring and aggregating information over multiple temporal interval scales, serving as a key complement to the spatial representation. Then, a Multiple Attention Aggregation (MAA) module is designed to help the network focus on the most discriminative information along the temporal, spatial and channel dimensions. Finally, a Multimodal Collaborative Learning (MCL) block fully exploits the complementary strengths of the different modal features through a cooperative learning strategy. The mean absolute error (MAE) for age estimation and the correct classification rate (CCR) for gender classification on OU‐MVLP reach 6.68 years and 97%, respectively, demonstrating the superiority of the method. Ablation experiments and visualisation results also confirm the effectiveness of the three individual modules within the framework. |
---|---|
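The abstract describes multi‐scale temporal aggregation, attention re‐weighting and the MAE/CCR metrics only at a high level. The sketch below (PyTorch) is a minimal illustration of those general ideas, not the authors' ASTL implementation: the class name MSTASketch, the scale set (1, 3, 5), the squeeze‐and‐excitation‐style channel attention used in place of the full MAA module, the absence of an MCL counterpart, and both prediction heads are assumptions made purely for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MSTASketch(nn.Module):
    """Toy multi-scale temporal aggregation over frame-level gait features."""

    def __init__(self, channels: int, scales=(1, 3, 5)):
        super().__init__()
        self.scales = scales
        fused = channels * len(scales)
        # Squeeze-and-excitation-style channel attention, standing in for the
        # paper's richer temporal/spatial/channel attention (MAA).
        self.attn = nn.Sequential(
            nn.Linear(fused, fused // 4),
            nn.ReLU(inplace=True),
            nn.Linear(fused // 4, fused),
            nn.Sigmoid(),
        )
        self.age_head = nn.Linear(fused, 1)      # age regression
        self.gender_head = nn.Linear(fused, 2)   # gender classification

    def forward(self, feats: torch.Tensor):
        # feats: [N, T, C] features extracted per silhouette frame.
        x = feats.permute(0, 2, 1)               # -> [N, C, T]
        pooled = []
        for k in self.scales:
            # Average over a temporal window of size k, then take the maximum
            # over time: captures dynamics at several temporal intervals.
            local = F.avg_pool1d(x, kernel_size=k, stride=1, padding=k // 2)
            pooled.append(local.max(dim=2).values)           # [N, C]
        z = torch.cat(pooled, dim=1)                          # [N, C * len(scales)]
        z = z * self.attn(z)                                  # re-weight channels
        return self.age_head(z).squeeze(-1), self.gender_head(z)


def mae(pred_age: torch.Tensor, true_age: torch.Tensor) -> torch.Tensor:
    """Mean absolute error in years (the age metric quoted in the abstract)."""
    return (pred_age - true_age).abs().mean()


def ccr(gender_logits: torch.Tensor, true_gender: torch.Tensor) -> torch.Tensor:
    """Correct classification rate (the gender metric quoted in the abstract)."""
    return (gender_logits.argmax(dim=1) == true_gender).float().mean()
```

For example, with feats = torch.randn(4, 30, 128) (four sequences of 30 frames with 128-dimensional features), MSTASketch(128)(feats) returns a length-4 age prediction and a 4 × 2 gender-logit tensor; mae and ccr mirror the MAE and CCR figures quoted above.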
Author | Xie, Jiahui; Pan, Jiahui; Huang, Binyuan; Luo, Yongdong; Zhou, Chengju |
Author_xml | – sequence: 1 givenname: Binyuan surname: Huang fullname: Huang, Binyuan organization: School of Software South China Normal University Guangzhou China
– sequence: 2 givenname: Yongdong surname: Luo fullname: Luo, Yongdong organization: School of Software South China Normal University Guangzhou China
– sequence: 3 givenname: Jiahui surname: Xie fullname: Xie, Jiahui organization: School of Software South China Normal University Guangzhou China
– sequence: 4 givenname: Jiahui surname: Pan fullname: Pan, Jiahui organization: School of Software South China Normal University Guangzhou China
– sequence: 5 givenname: Chengju orcidid: 0000-0003-4948-0909 surname: Zhou fullname: Zhou, Chengju organization: School of Software South China Normal University Guangzhou China |
BookMark | eNptUMFKAzEUDFLBtnrxC3IWtibZ7K45lqJWKHjR8_KavCyRbbYkscWbn-A3-iWmVTyIpzfDvHm8mQkZ-cEjIZeczTiT6lrvnJhxwevqhIx5U_FC1ZKNfnEpzsgkxhfGqlopOSZxnhL65Ab_-f4BewhI4xYyzzThZjsE6GmPELzzHbVDoJvXPrms7hzuaQcuZbyGiIZChxRjcpuD31PwhnboDQaqe4jRWaePyjk5tdBHvPiZU_J8d_u0WBarx_uHxXxV6LIRqeBacw4lNHCz1nXdVIppUzFQBkWj0UpTGWaVVTJnEWwtBWTdWC1LDmhEOSXs-64OQ4wBbatdOn6QAri-5aw9lNYeSmuPpWXL1R_LNuQ84e2_5S9s03ee |
CitedBy_id | crossref_primary_10_1016_j_array_2025_100379
crossref_primary_10_1016_j_eswa_2024_123843
crossref_primary_10_3233_THC_235012 |
Cites_doi | 10.1609/aaai.v33i01.33018126
10.1109/ICIP.2017.8296252
10.1186/s41074-017-0035-2
10.1007/978-981-16-8225-4_22
10.22161/ijaers.88.52
10.1186/s41074-019-0053-3
10.1109/CVPR.2018.00745
10.1016/j.patcog.2019.04.023
10.1109/tpami.2016.2545669
10.1007/978-3-030-58545-7_22
10.1109/CVPR.2018.00813
10.1109/tpami.2022.3183288
10.1016/j.patcog.2022.108797
10.1109/ICB45273.2019.8987240
10.1109/IJCB48548.2020.9304914
10.1007/978-3-030-01234-2_1
10.1007/978-981-10-4765-7_34
10.1007/s11432-019-2733-4
10.1214/aoms/1177729694
10.1109/tpami.2019.2938758
10.1109/CVPR42600.2020.01423
10.1109/WACV48630.2021.00350
10.1088/1742-6596/2010/1/012031
10.1109/ICB.2016.7550060
10.1109/ICPR.2010.934
10.1109/TIP.2022.3164543 |
ContentType | Journal Article |
DBID | AAYXX CITATION |
DOI | 10.1049/cvi2.12165 |
DatabaseName | CrossRef |
DatabaseTitle | CrossRef |
DatabaseTitleList | CrossRef |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Applied Sciences |
EISSN | 1751-9640 |
ExternalDocumentID | 10_1049_cvi2_12165 |
GroupedDBID | .DC 0R~ 0ZK 1OC 24P 29I 5GY 6IK 8FE 8FG 8VB AAHJG AAJGR AAMMB AAYXX ABJCF ABQXS ABUWG ACCMX ACESK ACGFO ACGFS ACIWK ACXQS ADEYR AEFGJ AEGXH AENEX AFKRA AGXDD AIDQK AIDYY ALMA_UNASSIGNED_HOLDINGS ALUQN ARAPS AVUZU AZQEC BENPR BGLVJ BPHCQ CCPQU CITATION CS3 DU5 DWQXO EBS EJD GNUQQ GROUPED_DOAJ HCIFZ HZ~ IAO IDLOA IPLJI ITC J9A K1G K6V K7- L6V LAI M43 M7S MCNEO MS~ O9- OK1 P62 PHGZM PHGZT PQGLB PQQKQ PROAC PTHSS PUEGO QWB RNS RUI S0W UNMZH WIN ZL0 ~ZZ |
ID | FETCH-LOGICAL-c372t-1cc11a3a7a8bc667590cd50a9de27cef4d5d0f9f9456920b42acd5dfc431aed23 |
ISSN | 1751-9632 |
IngestDate | Thu Apr 24 23:00:09 EDT 2025 Wed Aug 27 16:38:59 EDT 2025 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 1 |
Language | English |
LinkModel | OpenURL |
MergedId | FETCHMERGED-LOGICAL-c372t-1cc11a3a7a8bc667590cd50a9de27cef4d5d0f9f9456920b42acd5dfc431aed23 |
ORCID | 0000-0003-4948-0909 |
OpenAccessLink | https://onlinelibrary.wiley.com/doi/pdf/10.1049/cvi2.12165 |
ParticipantIDs | crossref_citationtrail_10_1049_cvi2_12165 crossref_primary_10_1049_cvi2_12165 |
PublicationCentury | 2000 |
PublicationDate | 2025-01-01 |
PublicationDateYYYYMMDD | 2025-01-01 |
PublicationDate_xml | – month: 01 year: 2025 text: 2025-01-01 day: 01 |
PublicationDecade | 2020 |
PublicationTitle | IET computer vision |
PublicationYear | 2025 |
References | e_1_2_11_10_1 e_1_2_11_31_1 e_1_2_11_30_1 e_1_2_11_14_1 e_1_2_11_13_1 e_1_2_11_12_1 e_1_2_11_11_1 e_1_2_11_7_1 e_1_2_11_29_1 e_1_2_11_6_1 e_1_2_11_28_1 e_1_2_11_5_1 e_1_2_11_27_1 e_1_2_11_4_1 e_1_2_11_26_1 e_1_2_11_3_1 e_1_2_11_2_1 Sakata A. (e_1_2_11_18_1) 2018 e_1_2_11_21_1 e_1_2_11_20_1 e_1_2_11_24_1 e_1_2_11_9_1 e_1_2_11_23_1 e_1_2_11_8_1 e_1_2_11_22_1 e_1_2_11_17_1 e_1_2_11_16_1 e_1_2_11_15_1 e_1_2_11_19_1 Takemura N. (e_1_2_11_32_1) 2018; 10 Xu C. (e_1_2_11_25_1) 2017; 9 |
References_xml | – ident: e_1_2_11_2_1 doi: 10.1609/aaai.v33i01.33018126
– ident: e_1_2_11_11_1 doi: 10.1109/ICIP.2017.8296252
– ident: e_1_2_11_5_1 doi: 10.1186/s41074-017-0035-2
– ident: e_1_2_11_21_1 doi: 10.1007/978-981-16-8225-4_22
– ident: e_1_2_11_20_1 doi: 10.22161/ijaers.88.52
– ident: e_1_2_11_24_1 doi: 10.1186/s41074-019-0053-3
– ident: e_1_2_11_9_1 doi: 10.1109/CVPR.2018.00745
– ident: e_1_2_11_10_1 doi: 10.1016/j.patcog.2019.04.023
– ident: e_1_2_11_13_1 doi: 10.1109/tpami.2016.2545669
– ident: e_1_2_11_29_1 doi: 10.1007/978-3-030-58545-7_22
– ident: e_1_2_11_6_1 doi: 10.1109/CVPR.2018.00813
– ident: e_1_2_11_26_1 doi: 10.1109/tpami.2022.3183288
– ident: e_1_2_11_28_1 doi: 10.1016/j.patcog.2022.108797
– ident: e_1_2_11_8_1
– ident: e_1_2_11_15_1 doi: 10.1109/ICB45273.2019.8987240
– ident: e_1_2_11_30_1 doi: 10.1109/IJCB48548.2020.9304914
– ident: e_1_2_11_7_1 doi: 10.1007/978-3-030-01234-2_1
– ident: e_1_2_11_23_1 doi: 10.1007/978-981-10-4765-7_34
– ident: e_1_2_11_14_1 doi: 10.1007/s11432-019-2733-4
– ident: e_1_2_11_31_1 doi: 10.1214/aoms/1177729694
– ident: e_1_2_11_27_1 doi: 10.1109/tpami.2019.2938758
– ident: e_1_2_11_3_1 doi: 10.1109/CVPR42600.2020.01423
– start-page: 55 volume-title: Asian Conference on Computer Vision year: 2018 ident: e_1_2_11_18_1
– ident: e_1_2_11_12_1 doi: 10.1109/WACV48630.2021.00350
– ident: e_1_2_11_19_1 doi: 10.1088/1742-6596/2010/1/012031
– ident: e_1_2_11_16_1 doi: 10.1109/ICB.2016.7550060
– volume: 9 start-page: 1 issue: 1 year: 2017 ident: e_1_2_11_25_1 article-title: The OU‐ISIR gait database comprising the large population dataset with age and performance evaluation of age estimation publication-title: IPSJ Trans. Computer Vision Appl.
– ident: e_1_2_11_22_1 doi: 10.1109/ICPR.2010.934
– ident: e_1_2_11_4_1 doi: 10.1109/TIP.2022.3164543
– volume: 10 start-page: 1 issue: 1 year: 2018 ident: e_1_2_11_32_1 article-title: Multi‐view large population gait dataset and its performance evaluation for cross‐view gait recognition publication-title: IPSJ Trans. Computer Vision Appl.
– ident: e_1_2_11_17_1 |
SSID | ssj0056994 |
Score | 2.3613236 |
Snippet | Recently, gait‐based age and gender recognition have attracted considerable attention in the fields of advertisement marketing and surveillance retrieval due... |
SourceID | crossref |
SourceType | Enrichment Source Index Database |
Title | Attention‐aware spatio‐temporal learning for multi‐view gait‐based age estimation and gender classification |
Volume | 19 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
link | http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwnV3Nb9MwFLdKd-HCxwAxGMjSuKAq0Dh2Uh8L2rTtwKlFhUvl2E6JNKUTTUBw2nHH_Y38JTx_xMm6HQaXKH1x0qrvl-dnv997D6E3JGWUmtBgQWMdUU50xDW8VzQ2xTdjBUsSW-3zU3o8p6cLthgMLnuspabO38nft-aV_I9WQQZ6NVmy_6DZ8FAQwDnoF46gYTjeScfTunZsxUBZED8Nk2tjadJB6MtPnbUtIhx10lIJwxibwLISZR0kZn5TI0PpMYU4XIajDTWsbPe5kTR-tyEadbr1Tu7J4cxS1U23iJFLXu_Q4_enP5TVr6bHB2rslu2XdbVSaz-ZgnjhoienpfjWlF20q9oS-l0Lwnq7Fs7QZiyO4OV3llj3Za58U7DOfBuFN4w-LHJAU_JHSUytDNd64npl7a0ZL_AQbQSe8qW5d2nvvYd2ALmEDNHO9PP867yd1VnKbVPN8LvbUreUv---uefc9LyU2SP0wC8v8NRh5TEa6GoXPfRLDewN-eYJ2gTo_Lm4sqDBDjTwsYULbuGCAS7YwgWuGqBgAxQ4txDBABHcQQQDRLCDCL4OkadofnQ4-3gc-f4bkUwyUkexlHEsEpGJSS5TWFnysVRsLLjSJJO6oIqpccELDk44J-OcEgHXVSHBKRVakeQZGlbrSj9HuJhMksLMDWACaJqoXCaUKSkyRlORU7qH3rZ_3FL64vSmR8rZ8qaK9tBBGHvuSrLcMurFnUa9RPc7dO6jYf290a_Ay6zz1x4AfwGh_4Ml |
linkProvider | Wiley-Blackwell |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Attention%E2%80%90aware+spatio%E2%80%90temporal+learning+for+multi%E2%80%90view+gait%E2%80%90based+age+estimation+and+gender+classification&rft.jtitle=IET+computer+vision&rft.au=Huang%2C+Binyuan&rft.au=Luo%2C+Yongdong&rft.au=Xie%2C+Jiahui&rft.au=Pan%2C+Jiahui&rft.date=2025-01-01&rft.issn=1751-9632&rft.eissn=1751-9640&rft.volume=19&rft.issue=1&rft_id=info:doi/10.1049%2Fcvi2.12165&rft.externalDBID=n%2Fa&rft.externalDocID=10_1049_cvi2_12165 |
thumbnail_l | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/lc.gif&issn=1751-9632&client=summon |
thumbnail_m | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/mc.gif&issn=1751-9632&client=summon |
thumbnail_s | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/sc.gif&issn=1751-9632&client=summon |