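Region-Aware Fashion Contrastive Learning for Unified Attribute Recognition and Composed Retrieval
TP391.4; Clothing attribute recognition has become a key technology that lets users automatically identify the characteristics of garments and search for clothing images with similar attributes. However, existing methods cannot recognize newly added attributes and may fail to capture region-level visual features. To address these problems, this study proposes a region-aware fashion contrastive language-image pre-training (RaF-CLIP) model. The model aligns cropped and segmented images with a category text and multiple fine-grained attribute texts, matching fashion regions to their corresponding texts through contrastive learning. Clothing retrieval finds suitable garments based on user-specified clothing categories and attributes; to further improve retrieval accuracy, this study, on top of the RaF-CLIP model...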
Published in | Journal of Donghua University (English Edition) Vol. 41; no. 4; pp. 405 - 415 |
---|---|
Main Authors | WANG Kangping, ZHAO Mingbo |
Format | Journal Article |
Language | Chinese |
Published | College of Information Science and Technology, Donghua University, Shanghai 201620; 2024 |
Subjects | |
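Abstract | TP391.4; Clothing attribute recognition has become a key technology that lets users automatically identify the characteristics of garments and search for clothing images with similar attributes. However, existing methods cannot recognize newly added attributes and may fail to capture region-level visual features. To address these problems, this study proposes a region-aware fashion contrastive language-image pre-training (RaF-CLIP) model. The model aligns cropped and segmented images with a category text and multiple fine-grained attribute texts, matching fashion regions to their corresponding texts through contrastive learning. Clothing retrieval finds suitable garments based on user-specified clothing categories and attributes; to further improve retrieval accuracy, this study adds an attribute-guided composed network (AGCN) to the RaF-CLIP model as an additional component dedicated to composed image retrieval, a task that modifies a reference image according to a textual expression in order to retrieve the intended target. Using transformer-based bidirectional attention and a gating mechanism, the network fuses and selects image features and attribute text features. Experimental results show that the proposed model reaches a mean precision of 0.6633 on the attribute recognition task and a recall@10 of 39.18 on the composed image retrieval task (recall@k is the percentage of correct samples that appear in the top k retrieval results), meeting users' need to search freely for clothing with images and text. |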
---|---|
Abstract_FL | Clothing attribute recognition has become an essential technology, which enables users to automatically identify the characteristics of clothes and search for clothing images with similar attributes. However, existing methods cannot recognize newly added attributes and may fail to capture region-level visual features. To address the aforementioned issues, a region-aware fashion contrastive language-image pre-training (RaF-CLIP) model was proposed. This model aligned cropped and segmented images with category and multiple fine-grained attribute texts, achieving the matching of fashion region and corresponding texts through contrastive learning. Clothing retrieval found suitable clothing based on the user-specified clothing categories and attributes, and to further improve the accuracy of retrieval, an attribute-guided composed network (AGCN) as an additional component on RaF-CLIP was introduced, specifically designed for composed image retrieval. This task aimed to modify the reference image based on textual expressions to retrieve the expected target. By adopting a transformer-based bidirectional attention and gating mechanism, it realized the fusion and selection of image features and attribute text features. Experimental results show that the proposed model achieves a mean precision of 0.6633 for attribute recognition tasks and a recall@10 (recall@k is defined as the percentage of correct samples appearing in the top k retrieval results) of 39.18 for composed image retrieval task, satisfying user needs for freely searching for clothing through images and texts. |
Author | 王康平 赵鸣博 |
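AuthorAffiliation | College of Information Science and Technology, Donghua University, Shanghai 201620 |
AuthorAffiliation_xml | – name: College of Information Science and Technology, Donghua University, Shanghai 201620 |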
Author_FL | ZHAO Mingbo WANG Kangping |
Author_FL_xml | – sequence: 1 fullname: WANG Kangping – sequence: 2 fullname: ZHAO Mingbo |
Author_xml | – sequence: 1 fullname: 王康平 – sequence: 2 fullname: 赵鸣博 |
ClassificationCodes | TP391.4 |
ContentType | Journal Article |
Copyright | Copyright © Wanfang Data Co. Ltd. All Rights Reserved. |
Copyright_xml | – notice: Copyright © Wanfang Data Co. Ltd. All Rights Reserved. |
DBID | 2B. 4A8 92I 93N PSX TCJ |
DOI | 10.19884/j.1672-5220.202405006 |
DatabaseName | Wanfang Data Journals - Hong Kong WANFANG Data Centre Wanfang Data Journals Wanfang Data Journals - Hong Kong Edition China Online Journals (COJ) China Online Journals (COJ) |
DatabaseTitleList | |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Engineering |
DocumentTitle_FL | Region-Aware Fashion Contrastive Learning for Unified Attribute Recognition and Composed Retrieval |
EndPage | 415 |
ExternalDocumentID | dhdxxb_e202404007 |
GroupedDBID | -02 -0B -SB -S~ 188 2B. 4A8 5VR 5XA 5XC 8RM 92D 92I 92M 93N 9D9 9DB ABJNI ACGFS ADMLS AFUIB ALMA_UNASSIGNED_HOLDINGS CAJEB CCEZO CDRFL CHBEP CW9 FA0 JUIAU PSX Q-- R-B RT2 S.. T8R TCJ TGH TTC U1F U1G U5B U5L UGNYK UZ2 UZ4 |
ID | FETCH-wanfang_journals_dhdxxb_e2024040073 |
ISSN | 1672-5220 |
IngestDate | Thu May 29 03:59:43 EDT 2025 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 4 |
Keywords | image retrieval; contrastive language-image pre-training (CLIP); attribute recognition; transformer; image text matching |
Language | Chinese |
LinkModel | OpenURL |
MergedId | FETCHMERGED-wanfang_journals_dhdxxb_e2024040073 |
ParticipantIDs | wanfang_journals_dhdxxb_e202404007 |
PublicationCentury | 2000 |
PublicationDate | 2024 |
PublicationDateYYYYMMDD | 2024-01-01 |
PublicationDate_xml | – year: 2024 text: 2024 |
PublicationDecade | 2020 |
PublicationTitle | Journal of Donghua University (English Edition) |
PublicationTitle_FL | Journal of Donghua University(English Edition) |
PublicationYear | 2024 |
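Publisher | College of Information Science and Technology, Donghua University, Shanghai 201620 |
Publisher_xml | – name: College of Information Science and Technology, Donghua University, Shanghai 201620 |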
SSID | ssib001166724 ssj0000627409 ssib018830140 ssib040214605 ssib022315852 ssib051367670 ssib006703047 |
Score | 4.721525 |
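Snippet | TP391.4; Clothing attribute recognition has become a key technology that lets users automatically identify the characteristics of garments and search for clothing images with similar attributes. However, existing methods cannot recognize newly added attributes and may fail to capture region-level visual features. To address these problems, this study proposes a region-aware fashion contrastive... |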
SourceID | wanfang |
SourceType | Aggregation Database |
StartPage | 405 |
Title | Region-Aware Fashion Contrastive Learning for Unified Attribute Recognition and Composed Retrieval |
URI | https://d.wanfangdata.com.cn/periodical/dhdxxb-e202404007 |
Volume | 41 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
linkProvider | EBSCOhost |
thumbnail_s | http://utb.summon.serialssolutions.com/2.0.0/image/custom?url=http%3A%2F%2Fwww.wanfangdata.com.cn%2Fimages%2FPeriodicalImages%2Fdhdxxb-e%2Fdhdxxb-e.jpg |
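
The abstract reports a recall@10 of 39.18 and defines recall@k as the percentage of correct samples appearing in the top k retrieval results. The following is a minimal sketch of one common way to compute that metric, not the authors' evaluation code. It assumes each query has a ranked result list and a set of correct targets, and counts a query as a hit when any correct target appears in its top k. The function name `recall_at_k` and the toy item ids are hypothetical.

```python
from typing import Hashable, Sequence, Set


def recall_at_k(
    ranked_results: Sequence[Sequence[Hashable]],
    relevant: Sequence[Set[Hashable]],
    k: int = 10,
) -> float:
    """Return the percentage of queries with a correct item in the top k.

    Assumption (not stated in the record): a query counts as a hit when at
    least one of its correct targets appears among its first k results.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    if len(ranked_results) != len(relevant):
        raise ValueError("need exactly one target set per query")
    if not ranked_results:
        return 0.0

    hits = 0
    for ranking, targets in zip(ranked_results, relevant):
        # Only the first k retrieved items are considered.
        if any(item in targets for item in ranking[:k]):
            hits += 1
    return 100.0 * hits / len(ranked_results)


# Toy data: four queries, each with three ranked results (hypothetical ids).
rankings = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"], ["j", "k", "l"]]
targets = [{"b"}, {"f"}, {"x"}, {"j"}]

for k in (1, 2, 3):
    print(f"recall@{k} = {recall_at_k(rankings, targets, k):.2f}")
```

On this scale a score such as 39.18 would mean that roughly 39 of every 100 queries had a correct target within the first ten results, under the hit definition assumed above.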