TAGE: Trustworthy Attribute Group Editing for Stable Few-shot Image Generation
Main Authors | |
---|---|
Format | Journal Article |
Language | English |
Published | 23.10.2024 |
Subjects | |
Online Access | Get full text |
Summary: | Generative Adversarial Networks (GANs) have emerged as a prominent research
focus for image editing tasks, leveraging the framework's powerful image
generation capabilities to produce remarkable results. However, prevailing
approaches are contingent upon extensive training datasets and explicit
supervision, presenting a significant challenge in manipulating the diverse
attributes of new image classes with limited sample availability. To surmount
this hurdle, we introduce TAGE, an innovative image generation network
comprising three integral modules: the Codebook Learning Module (CLM), the
Code Prediction Module (CPM), and the Prompt-driven Semantic Module (PSM). The
CLM delves into the semantic dimensions of category-agnostic attributes,
encapsulating them within a discrete codebook. It is predicated on the concept
that images are assemblages of attributes, and thus, by editing these
category-independent attributes, it is theoretically possible to generate
images from unseen categories. Subsequently, the CPM facilitates naturalistic
image editing by predicting the indices of category-independent attribute
vectors within the codebook. Additionally, the PSM generates semantic cues
that are seamlessly integrated into the Transformer architecture of the CPM,
enhancing the model's comprehension of the targeted attributes for editing.
With these semantic cues, the model can generate images that accentuate
desired attributes more prominently while maintaining the integrity of the
original category, even with a limited number of samples. We have conducted
extensive experiments on the Animal Faces, Flowers, and VGGFaces datasets. The
results of these experiments demonstrate that our proposed method not only
achieves superior performance but also exhibits a high degree of stability
compared with other few-shot image generation techniques. |
---|---|
DOI: | 10.48550/arxiv.2410.17855 |
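
The record contains no code, but the mechanism the abstract describes can be made concrete. Below is a minimal PyTorch-style sketch, assuming a VQ-style discrete codebook standing in for the CLM, a standard Transformer encoder standing in for the CPM, and prompt-derived cue tokens prepended to the sequence standing in for the PSM. Every class name, shape, and hyperparameter here is a hypothetical illustration, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class AttributeCodebook(nn.Module):
    """Rough analogue of the Codebook Learning Module (CLM): a discrete
    set of category-agnostic attribute vectors, quantized VQ-style by
    nearest-neighbour lookup. (Illustrative, not the paper's code.)"""
    def __init__(self, num_codes=512, dim=256):
        super().__init__()
        self.codes = nn.Embedding(num_codes, dim)

    def quantize(self, features):
        # features: (batch, dim) continuous attribute features.
        # Distance to every code vector: (batch, num_codes).
        d = torch.cdist(features, self.codes.weight)
        indices = d.argmin(dim=-1)          # nearest code per feature
        return self.codes(indices), indices

class CodePredictor(nn.Module):
    """Rough analogue of the Code Prediction Module (CPM): a small
    Transformer that predicts codebook indices for the attributes to
    edit, conditioned on prompt-derived cue tokens (the PSM's role)."""
    def __init__(self, num_codes=512, dim=256, heads=4, layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=layers)
        self.head = nn.Linear(dim, num_codes)

    def forward(self, attr_tokens, prompt_cues):
        # Prepend cue tokens so self-attention can condition on them.
        x = torch.cat([prompt_cues, attr_tokens], dim=1)
        logits = self.head(self.encoder(x))
        # Keep only the predictions for the attribute positions.
        return logits[:, prompt_cues.size(1):].argmax(dim=-1)

# Toy usage: edit 8 attribute slots of a 4-image batch.
codebook = AttributeCodebook()
predictor = CodePredictor()
attrs = torch.randn(4, 8, 256)     # stand-in attribute tokens
cues = torch.randn(4, 2, 256)      # stand-in prompt-driven semantic cues
indices = predictor(attrs, cues)   # (4, 8) predicted codebook indices
edited = codebook.codes(indices)   # swap in predicted attribute vectors
print(edited.shape)                # torch.Size([4, 8, 256])
```

Under these assumptions, editing an unseen category amounts to swapping the discrete, category-independent attribute vectors selected by the predictor, while the rest of the image representation, and hence the category identity, is left untouched.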