ESTGN: Enhanced Self-Mined Text Guided Super-Resolution Network for Superior Image Super Resolution

Bibliographic Details
Published in	ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 3655-3659
Main Authors	Li, Qipei; Ying, Zefeng; Pan, Da; Fan, Zhaoxin; Shi, Ping
Format	Conference Proceeding
Language	English
Published	IEEE 14.04.2024
Subjects	Acoustics; Benchmark testing; Data mining; Image restoration; Image Super-resolution; Information retrieval; Modal Balance; Multi-modal Learning; Semantics; Superresolution; Text-mining; Vision-language Correspond
Summary:	In this paper, we propose a novel Enhanced Self-mined Text Guided Super-resolution Network (ESTGN) for single image super-resolution (SISR). Unlike preceding methods, ESTGN autonomously mines task-related text from images and uses it to guide SR for high-frequency detail restoration. The proposed method comprises three modules: the Self-mined Text Information Extraction Module, the Multi-resolution Text-aware Gradient Balance Module, and the Masked Text-conditioned Attention Module. Our method can fully leverage self-mined textual semantic information and enhance gradient propagation in text. We validate our method with extensive experiments on the benchmark dataset, where ESTGN significantly outperforms the baseline model and sets a new state of the art. This work opens up a promising avenue for the integration of text information in image SR tasks.
ISSN:	2379-190X
DOI:	10.1109/ICASSP48485.2024.10448088