Frequency compensated diffusion model for real-scene dehazing
Published in | Neural Networks, Vol. 175, p. 106281 |
---|---|
Main Authors | , , , , |
Format | Journal Article |
Language | English |
Published | United States, Elsevier Ltd, 01.07.2024 |
Summary: | Due to distribution shift, deep-learning-based methods for image dehazing suffer from performance degradation when applied to real-world hazy images. This paper considers a dehazing framework based on conditional diffusion models for improved generalization to real haze. First, the authors find that optimizing the training objective of diffusion models, whose targets are Gaussian noise vectors, is non-trivial. The spectral bias of deep networks hinders the higher-frequency modes in Gaussian vectors from being learned and hence impairs the reconstruction of image details. To tackle this issue, they design a network unit, named the Frequency Compensation block (FCB), with a bank of filters that jointly emphasize the mid-to-high frequencies of an input signal. Diffusion models with FCB are shown to achieve significant gains in both perceptual and distortion metrics. Second, to further boost generalization performance, the authors propose a novel data synthesis pipeline, HazeAug, which augments haze in terms of degree and diversity. Within this framework, a solid baseline for blind dehazing is set up in which models are trained on synthetic hazy-clean pairs and generalize directly to real data. Extensive evaluations on real dehazing datasets demonstrate the superior performance of the proposed dehazing diffusion model in distortion metrics. Compared to recent methods pre-trained on large-scale, high-quality image datasets, the model achieves a significant PSNR improvement of over 1 dB on challenging databases such as Dense-Haze and Nh-Haze. |
---|---|
ISSN: | 0893-6080 1879-2782 |
DOI: | 10.1016/j.neunet.2024.106281 |
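
The summary describes a bank of filters that jointly emphasize the mid-to-high frequencies of an input signal. The following is a minimal illustrative sketch of that general idea only, not the FCB from the paper: it assumes a fixed bank of Gaussian high-pass responses applied in the Fourier domain, and the names `gaussian_transfer`, `compensate`, `sigmas`, and `weight` are hypothetical.

```python
import numpy as np


def gaussian_transfer(shape, sigma):
    """Frequency response of a Gaussian blur with standard deviation `sigma` (pixels)."""
    fy = np.fft.fftfreq(shape[0])[:, None]  # vertical frequency, cycles per pixel
    fx = np.fft.fftfreq(shape[1])[None, :]  # horizontal frequency, cycles per pixel
    return np.exp(-2.0 * np.pi**2 * sigma**2 * (fx**2 + fy**2))


def compensate(image, sigmas=(1.0, 2.0, 4.0), weight=1.0):
    """Boost mid-to-high frequencies of a 2D array with a bank of high-pass filters.

    Assumption for illustration: each filter is (1 - Gaussian low-pass), and the
    bank is averaged. The gain is 1 at zero frequency and approaches
    1 + weight at the highest frequencies.
    """
    spectrum = np.fft.fft2(image)
    highpass = np.mean(
        [1.0 - gaussian_transfer(image.shape, s) for s in sigmas], axis=0
    )
    gain = 1.0 + weight * highpass
    return np.real(np.fft.ifft2(spectrum * gain))


# A flat image has only a zero-frequency component, so it passes through unchanged.
flat = np.full((8, 8), 3.0)
flat_out = compensate(flat)

# A checkerboard sits at the highest representable frequency, so it is amplified.
rows, cols = np.indices((8, 8))
checker = (-1.0) ** (rows + cols)
checker_out = compensate(checker)
```

In this sketch the smaller `sigmas` values target the highest frequencies, while the larger ones extend the boost into the mid band; a learned block would presumably replace these fixed responses with trainable filters.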