Multi-Modal Image Super-Resolution via Deep Convolutional Transform Learning

Bibliographic Details
Published in: 2024 32nd European Signal Processing Conference (EUSIPCO), pp. 671-675
Main Authors: Kumar, Kriti; Majumdar, Angshul; Kumar, A Anil; Chandra, M Girish
Format: Conference Proceeding
Language: English
Published: European Association for Signal Processing (EURASIP), 26.08.2024
ISSN: 2076-1465
DOI: 10.23919/EUSIPCO63174.2024.10715007

Summary: Real-world situations often involve processing data from diverse imaging modalities such as Multispectral (MS), Near Infrared (NIR), and RGB, each capturing different aspects of the same scene. These modalities often differ in spatial and spectral resolution. Hence, Multi-modal Image Super-Resolution (MISR) techniques are required to improve the spatial/spectral resolution of the target modality with the help of a High Resolution (HR) guidance modality that shares common features such as textures, edges, and other structures. Traditional MISR approaches based on Convolutional Neural Networks (CNNs) typically employ an encoder-decoder architecture, which is prone to overfitting in data-limited scenarios. This work proposes a novel deep convolutional analysis sparse coding method that uses convolutional transforms within a fusion framework, eliminating the need for a decoder network; this reduces the number of trainable parameters and makes the method better suited to data-limited applications. A joint optimization framework is proposed that learns deep convolutional transforms for Low Resolution (LR) images of the target modality and HR images of the guidance modality, along with a fusion transform that combines these transform features to reconstruct HR images of the target modality. In contrast to dictionary-based synthesis sparse coding methods for MISR, the proposed approach offers improved performance with reduced complexity, leveraging the inherent advantages of transform learning. The efficacy of the proposed method is demonstrated on RGB-NIR and RGB-MS datasets, showing superior reconstruction performance compared to state-of-the-art techniques without introducing additional artifacts from the guidance image.
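The fusion idea in the abstract can be illustrated with a minimal numpy sketch: each modality is passed through a bank of convolution filters (a convolutional analysis transform), the features are sparsified, and a learned fusion transform maps the stacked features directly to the HR target estimate, with no decoder. All names, sizes, and the random filters below are hypothetical stand-ins; the paper learns the transforms via its joint optimization, which is not reproduced here.

```python
import numpy as np

def conv2d_same(image, kernel):
    # 'same'-size 2-D correlation in plain numpy (no learned machinery).
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(image)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def conv_transform(image, filters):
    # Convolutional analysis transform: one feature map per filter.
    return np.stack([conv2d_same(image, f) for f in filters])

def hard_threshold(x, tau=0.5):
    # Sparsify the transform features (analysis sparse coding step).
    return np.where(np.abs(x) > tau, x, 0.0)

rng = np.random.default_rng(0)

# Hypothetical inputs: an 8x8 LR target-modality image (upsampled to the
# guidance grid) and an 8x8 HR guidance-modality image of the same scene.
lr_target = rng.standard_normal((8, 8))
hr_guidance = rng.standard_normal((8, 8))

# Stand-ins for the learned 3x3 convolutional transforms of each branch;
# random here, learned by the joint optimization in the paper.
T_target = rng.standard_normal((4, 3, 3))
T_guidance = rng.standard_normal((4, 3, 3))

z_t = hard_threshold(conv_transform(lr_target, T_target))
z_g = hard_threshold(conv_transform(hr_guidance, T_guidance))

# Fusion transform: a linear map over the stacked feature channels that
# reconstructs the HR target image directly -- no decoder network.
features = np.concatenate([z_t, z_g])              # shape (8, 8, 8)
W_fuse = rng.standard_normal(features.shape[0]) / features.shape[0]
hr_estimate = np.tensordot(W_fuse, features, axes=1)

print(hr_estimate.shape)  # (8, 8)
```

Because reconstruction is a single fused analysis pass rather than an encoder-decoder round trip, the trainable parameters are just the filter banks and the fusion weights, which is the parameter saving the abstract highlights.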