Nighttime Thermal Infrared Image Colorization With Feedback-Based Object Appearance Learning


Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, Vol. 34, No. 6, pp. 4745-4761
Main Authors: Luo, Fu-Ya; Liu, Shu-Lin; Cao, Yi-Jun; Yang, Kai-Fu; Xie, Chang-Yong; Liu, Yong; Li, Yong-Jie
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.06.2024

Summary: Stable imaging in adverse environments (e.g., total darkness) makes thermal infrared (TIR) cameras a prevalent option for night scene perception. However, the low contrast and lack of chromaticity of TIR images are detrimental to human interpretation and to the subsequent deployment of RGB-based vision algorithms. It therefore makes sense to colorize nighttime TIR images by translating them into corresponding daytime color images (NTIR2DC). Despite the impressive progress made on the NTIR2DC task, how to improve the translation performance of small object classes remains under-explored. To address this problem, we propose a generative adversarial network incorporating feedback-based object appearance learning (FoalGAN). Specifically, an occlusion-aware mixup module and a corresponding appearance consistency loss are proposed to reduce the context dependence of object translation. Taking traffic lights as a representative example of small objects in nighttime street scenes, we illustrate how to enhance their realism by designing a traffic light appearance loss. To further improve the appearance learning of small objects, we devise a dual feedback learning strategy that selectively adjusts the learning frequency of different samples. In addition, we provide pixel-level annotations for a subset of the Brno dataset, which can facilitate research on NTIR image understanding under multiple weather conditions. Extensive experiments show that the proposed FoalGAN is not only effective for appearance learning of small objects but also outperforms other image translation methods in terms of semantic preservation and edge consistency on the NTIR2DC task. Compared with the state-of-the-art NTIR2DC approach, FoalGAN achieves at least a 5.4% improvement in semantic consistency and at least a 2% lead in edge consistency.
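
Note: The summary only names the occlusion-aware mixup module, its appearance consistency loss, and the dual feedback learning strategy without detailing them. The following Python (PyTorch) fragment is a minimal sketch of what such mechanisms might look like, reconstructed from the summary alone; every function name, tensor convention, and the blending and weighting schemes are assumptions for illustration, not the authors' implementation.

import torch

def occlusion_aware_mixup(obj_patch, obj_mask, scene, occluder_mask):
    # Paste an object into a new scene context while letting occluder pixels
    # from the scene cover part of it, so the object does not float
    # unnaturally. Masks are float tensors in [0, 1] (assumed convention).
    visible = obj_mask * (1.0 - occluder_mask)
    return visible * obj_patch + (1.0 - visible) * scene

def appearance_consistency_loss(generator, tir_orig, tir_mixed, obj_mask):
    # Penalize differences between the translated object in its original
    # context and in the mixed context, which should reduce the context
    # dependence of object translation described in the summary.
    out_orig = generator(tir_orig)
    out_mixed = generator(tir_mixed)
    diff = (out_orig - out_mixed).abs() * obj_mask
    return diff.sum() / obj_mask.sum().clamp(min=1.0)

def feedback_sampling_weights(per_sample_losses, temperature=1.0):
    # One plausible reading of the dual feedback idea: samples the
    # generator still translates poorly receive higher sampling
    # probability, i.e., a higher learning frequency, in later epochs.
    scores = torch.as_tensor(per_sample_losses, dtype=torch.float32)
    return torch.softmax(scores / temperature, dim=0)

Under these assumptions, a training loop would draw minibatches according to feedback_sampling_weights (e.g., via torch.multinomial) and add appearance_consistency_loss to the usual adversarial objectives; the paper's actual losses and scheduling may differ.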
ISSN: 1051-8215
EISSN: 1558-2205
DOI: 10.1109/TCSVT.2023.3331499