Multi-Modal Image Super-Resolution via Deep Convolutional Transform Learning
| Published in | 2024 32nd European Signal Processing Conference (EUSIPCO), pp. 671 - 675 |
|---|---|
| Format | Conference Proceeding |
| Language | English |
| Published | European Association for Signal Processing - EURASIP, 26.08.2024 |
| ISSN | 2076-1465 |
| DOI | 10.23919/EUSIPCO63174.2024.10715007 |
Summary: Real-world situations often involve processing data from diverse imaging modalities such as Multispectral (MS), Near Infrared (NIR), and RGB, each capturing different aspects of the same scene. These modalities often vary in spatial and spectral resolution. Hence, Multi-modal Image Super-Resolution (MISR) techniques are required to improve the spatial/spectral resolution of the target modality with the help of a High Resolution (HR) guidance modality that shares common features such as textures, edges, and other structures. Traditional MISR approaches based on Convolutional Neural Networks (CNNs) typically employ an encoder-decoder architecture, which is prone to overfitting in data-limited scenarios. This work proposes a novel deep convolutional analysis sparse coding method that uses convolutional transforms within a fusion framework and eliminates the need for a decoder network, reducing the number of trainable parameters and making the approach better suited to data-limited applications. A joint optimization framework is proposed that learns deep convolutional transforms for Low Resolution (LR) images of the target modality and HR images of the guidance modality, along with a fusion transform that combines these transform features to reconstruct HR images of the target modality. In contrast to dictionary-based synthesis sparse coding methods for MISR, the proposed approach offers improved performance with reduced complexity, leveraging the inherent advantages of transform learning. The efficacy of the proposed method is demonstrated on RGB-NIR and RGB-MS datasets, showing superior reconstruction performance compared to state-of-the-art techniques without introducing additional artifacts from the guidance image.
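The abstract describes an encoder-only fusion design: convolutional transform branches for the LR target and HR guidance modalities, plus a fusion transform that reconstructs the HR target without a decoder. Below is a minimal PyTorch sketch of that general idea only. The module names, layer widths, depths, and the bilinear upsampling step are illustrative assumptions, not the authors' architecture, and the sketch omits the sparsity-promoting regularization on the transform features that the transform-learning formulation relies on.

```python
# Minimal sketch (assumed design, not the paper's exact model): two convolutional
# "transform" branches with no decoder, and a learned fusion transform that maps
# their concatenated features to the HR target image.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvTransform(nn.Module):
    """Stack of convolutions acting as an analysis-style transform branch."""
    def __init__(self, in_ch, feat_ch=64, depth=3):
        super().__init__()
        layers = [nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 1):
            layers += [nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True)]
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)


class MISRFusion(nn.Module):
    """Target and guidance transforms plus a fusion transform; no decoder network."""
    def __init__(self, target_ch=1, guide_ch=3, feat_ch=64):
        super().__init__()
        self.target_transform = ConvTransform(target_ch, feat_ch)  # LR target branch
        self.guide_transform = ConvTransform(guide_ch, feat_ch)    # HR guidance branch
        # Fusion transform maps the concatenated features directly to the HR target.
        self.fusion = nn.Conv2d(2 * feat_ch, target_ch, 3, padding=1)

    def forward(self, lr_target, hr_guide):
        f_t = self.target_transform(lr_target)
        # Bring LR-target features onto the guidance (HR) grid before fusion
        # (bilinear upsampling is an assumption made for this sketch).
        f_t = F.interpolate(f_t, size=hr_guide.shape[-2:], mode="bilinear",
                            align_corners=False)
        f_g = self.guide_transform(hr_guide)
        return self.fusion(torch.cat([f_t, f_g], dim=1))


if __name__ == "__main__":
    # Toy example: 4x super-resolution of a single-band (e.g. NIR) image guided by RGB.
    model = MISRFusion(target_ch=1, guide_ch=3)
    lr_nir = torch.randn(2, 1, 32, 32)
    hr_rgb = torch.randn(2, 3, 128, 128)
    hr_pred = model(lr_nir, hr_rgb)
    print(hr_pred.shape)  # torch.Size([2, 1, 128, 128])
```

Dropping the decoder is the point the abstract emphasizes: all trainable parameters sit in the transform branches and the fusion layer, which keeps the model small for data-limited settings.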