HiCAST: Highly Customized Arbitrary Style Transfer with Adapter Enhanced Diffusion Models
Main Authors | , , , , , , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 11.01.2024 |
Summary: | The goal of Arbitrary Style Transfer (AST) is to inject the artistic
features of a style reference into a given image or video. Existing methods
usually focus on balancing style and content, while ignoring the significant
demand for flexible, customized stylization results, which limits their
practical application. To address this issue, we propose a novel AST approach,
HiCAST, which can explicitly customize the stylization results according to
various sources of semantic clues. Specifically, our model is built on the
Latent Diffusion Model (LDM) and is designed to take content and style
instances as conditions of the LDM. Its key component is a Style Adapter,
which allows users to flexibly manipulate the output by aligning multi-level
style information with the intrinsic knowledge in the LDM. Lastly, we extend
our model to perform video AST. A novel learning objective is used for video
diffusion model training, which significantly improves cross-frame temporal
consistency while maintaining stylization strength. Qualitative and
quantitative comparisons, as well as comprehensive user studies, demonstrate
that HiCAST outperforms existing state-of-the-art (SoTA) methods in generating
visually plausible stylization results. |
---|---|
DOI: | 10.48550/arxiv.2401.05870 |