Clothing Parsing Based on Multi-Scale Fusion and Improved Self-Attention Mechanism

Bibliographic Details
Published in	Journal of Donghua University (English Edition) [东华大学学报（英文版）] Vol. 40; no. 6; pp. 661-666
Main Authors	CHEN Nuo, WANG Shaoyu, LU Ran, LI Wenxuan, QIN Zhidong, SHI Xiujin
Format	Journal Article
Language	English
Published	31.12.2023
Subjects	self-attention mechanism; clothing parsing; convolutional neural network; multi-scale fusion; vision Transformer
Summary:	Existing deep learning-based methods lack long-range association and spatial location information, so they cannot always recover fine details and accurate boundaries in complex clothing images. This paper presents a convolutional structure with multi-scale fusion to optimize clothing feature extraction, together with a self-attention module that captures long-range association information. Through the down-scaling projection operation of the multi-scale framework, the structure lets the self-attention mechanism participate directly in information exchange. In addition, the improved self-attention module extracts 2-dimensional relative position information to compensate for its limited ability to extract spatial position features from clothing images. Experimental results on the Colorful Fashion Parsing Dataset (CFPD) show that the proposed network structure achieves 53.68% mean intersection over union (mIoU) and improves performance on the clothing parsing task.
CLC number:	TP391.4
ISSN:	1672-5220
DOI:	10.19884/j.1672-5220.202303008
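
Note on the reported metric: the summary gives its result as mean intersection over union (mIoU). The sketch below shows one common way to compute mIoU from a predicted and a ground-truth label map, assuming the standard per-class definition averaged over classes that occur in either map. The record does not describe the authors' evaluation protocol (for example, how background or absent classes are treated), so the function name `mean_iou`, the toy label maps, and the class count are illustrative assumptions, not the paper's code.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in the prediction or the ground truth."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predicted classes.
    conf = np.bincount(target * num_classes + pred, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)
    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    present = union > 0  # skip classes absent from both maps (an assumption)
    return float((intersection[present] / union[present]).mean())

# Toy 2x3 label maps with three hypothetical classes.
pred = np.array([[0, 1, 1], [2, 2, 0]])
target = np.array([[0, 1, 2], [2, 2, 0]])
score = mean_iou(pred, target, num_classes=3)
```

In this toy case the per-class IoUs are 1, 1/2, and 2/3, so `score` is their mean, 13/18.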