Multi-Modal Image Super-Resolution via Deep Convolutional Transform Learning

Bibliographic Details
Published in	2024 32nd European Signal Processing Conference (EUSIPCO), pp. 671-675
Main Authors	Kumar, Kriti; Majumdar, Angshul; Kumar, A Anil; Chandra, M Girish
Format	Conference Proceeding
Language	English
Published	European Association for Signal Processing - EURASIP 26.08.2024
Subjects	Convolution; Convolutional codes; Convolutional sparse coding; Convolutional transform learning; Deep models; Encoding; Image reconstruction; Imaging; Joint optimization; Multimodal image super-resolution; Optimization; Spatial resolution; Superresolution; Training data; Transforms
Online Access	Get full text
ISSN	2076-1465
DOI	10.23919/EUSIPCO63174.2024.10715007

Abstract	Real-world situations often involve processing data from diverse imaging modalities such as Multispectral (MS), Near-Infrared (NIR), and RGB, each capturing different aspects of the same scene. These modalities often differ in spatial and spectral resolution. Multi-modal Image Super-Resolution (MISR) techniques are therefore required to improve the spatial or spectral resolution of a target modality with the help of a High-Resolution (HR) guidance modality that shares common features such as textures, edges, and other structures. Traditional MISR approaches based on Convolutional Neural Networks (CNNs) typically employ an encoder-decoder architecture, which is prone to overfitting in data-limited scenarios. This work proposes a novel deep convolutional analysis sparse coding method that uses convolutional transforms within a fusion framework and eliminates the need for a decoder network, thereby reducing the number of trainable parameters and improving suitability for data-limited applications. A joint optimization framework is proposed that learns deep convolutional transforms for Low-Resolution (LR) images of the target modality and HR images of the guidance modality, along with a fusion transform that combines these transform features to reconstruct HR images of the target modality. In contrast to dictionary-based synthesis sparse coding methods for MISR, the proposed approach offers improved performance with reduced complexity, leveraging the inherent advantages of transform learning. The efficacy of the proposed method is demonstrated on RGB-NIR and RGB-MS datasets, showing superior reconstruction performance compared to state-of-the-art techniques without introducing additional artifacts from the guidance image.
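The decoder-free structure described in the abstract can be illustrated with a minimal NumPy sketch of a forward pass only. This is not the authors' implementation: the record gives no layer counts, filter sizes, nonlinearity, or training procedure, so everything below is an assumption made for illustration. In particular, the nearest-neighbour upsampling of the LR image, the ReLU nonlinearity, the modelling of the fusion transform as a per-pixel linear combination of the stacked features, and all function names (conv2d_same, transform_layer, deep_features, misr_forward) are hypothetical.

```python
import numpy as np

def conv2d_same(image, kernel):
    """Single-channel 2-D correlation with zero padding (output size = input size).
    Assumes an odd-sized kernel."""
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros(image.shape, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + image.shape[0], j:j + image.shape[1]]
    return out

def transform_layer(features, filters):
    """One convolutional transform layer.
    features: (C_in, H, W); filters: (C_out, C_in, k, k).
    ReLU is an assumed nonlinearity, not taken from the record."""
    out = np.stack([
        sum(conv2d_same(features[c], filt[c]) for c in range(features.shape[0]))
        for filt in filters
    ])
    return np.maximum(out, 0.0)

def deep_features(image, layers):
    """Apply a stack of transform layers to a single-channel image."""
    x = image[None].astype(float)
    for filters in layers:
        x = transform_layer(x, filters)
    return x

def misr_forward(lr_target, hr_guide, target_layers, guide_layers, fusion_weights, scale):
    """Fuse target and guidance features directly into an HR estimate (no decoder)."""
    # Assumption: the LR target is brought onto the HR grid by pixel repetition.
    upsampled = np.kron(lr_target, np.ones((scale, scale)))
    z_target = deep_features(upsampled, target_layers)
    z_guide = deep_features(hr_guide, guide_layers)
    stacked = np.concatenate([z_target, z_guide])  # (C_t + C_g, H, W)
    # Assumption: the fusion transform is a per-pixel linear map over channels.
    return np.tensordot(fusion_weights, stacked, axes=1)

# Usage with random (untrained) filters, only to show the data flow and shapes.
rng = np.random.default_rng(0)
lr = rng.random((4, 4))   # low-resolution target image
hr = rng.random((8, 8))   # high-resolution guidance image
t_layers = [rng.standard_normal((3, 1, 3, 3)), rng.standard_normal((2, 3, 3, 3))]
g_layers = [rng.standard_normal((3, 1, 3, 3)), rng.standard_normal((2, 3, 3, 3))]
w = rng.standard_normal(4)
estimate = misr_forward(lr, hr, t_layers, g_layers, w, scale=2)
print(estimate.shape)
```

In the method the abstract describes, the two sets of transforms and the fusion transform are learned jointly; the sketch omits that optimization and only shows how an HR estimate could be formed from the fused features without a decoder network.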
Author	Majumdar, Angshul; Kumar, Kriti; Chandra, M Girish; Kumar, A Anil
Author_xml	– sequence: 1 givenname: Kriti surname: Kumar fullname: Kumar, Kriti email: kriti.kumar@tcs.com organization: TCS Research,India – sequence: 2 givenname: Angshul surname: Majumdar fullname: Majumdar, Angshul email: angshul@iiitd.ac.in organization: IIIT Delhi,India – sequence: 3 givenname: A Anil surname: Kumar fullname: Kumar, A Anil email: achannaanil.kumar@tcs.com organization: TCS Research,India – sequence: 4 givenname: M Girish surname: Chandra fullname: Chandra, M Girish email: m.gchandra@tcs.com organization: Institute of Advanced Intelligence, TCG CREST,India
ContentType	Conference Proceeding
DBID	6IE 6IL CBEJK RIE RIL
DOI	10.23919/EUSIPCO63174.2024.10715007
DatabaseName	IEEE Electronic Library (IEL) Conference Proceedings IEEE Xplore POP ALL IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml	– sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher
DeliveryMethod	fulltext_linktorsrc
EISBN	9789464593617 946459361X
EISSN	2076-1465
EndPage	675
ExternalDocumentID	10715007
Genre	orig-research
GroupedDBID	6IE 6IL ALMA_UNASSIGNED_HOLDINGS CBEJK RIE RIL
ID	FETCH-LOGICAL-i176t-4807ff8cf5d5ba1dd973312d2325fcf268abe9bfe44b09d8fa5f53dde4412c023
IEDL.DBID	RIE
IngestDate	Wed Jan 22 08:32:23 EST 2025
IsPeerReviewed	false
IsScholarly	false
Language	English
LinkModel	DirectLink
MergedId	FETCHMERGED-LOGICAL-i176t-4807ff8cf5d5ba1dd973312d2325fcf268abe9bfe44b09d8fa5f53dde4412c023
PageCount	5
ParticipantIDs	ieee_primary_10715007
PublicationCentury	2000
PublicationDate	2024-Aug.-26
PublicationDateYYYYMMDD	2024-08-26
PublicationDate_xml	– month: 08 year: 2024 text: 2024-Aug.-26 day: 26
PublicationDecade	2020
PublicationTitle	2024 32nd European Signal Processing Conference (EUSIPCO)
PublicationTitleAbbrev	EUSIPCO
PublicationYear	2024
Publisher	European Association for Signal Processing - EURASIP
Publisher_xml	– name: European Association for Signal Processing - EURASIP
SSID	ssib028087106 ssib025355106
Score	1.8833982
SourceID	ieee
SourceType	Publisher
StartPage	671
Title	Multi-Modal Image Super-Resolution via Deep Convolutional Transform Learning
URI	https://ieeexplore.ieee.org/document/10715007
hasFullText	1
inHoldings	1