HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing
Published in | Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) pp. 18490 - 18500 |
---|---|
Main Authors | Alaluf, Yuval; Tov, Omer; Mokady, Ron; Gal, Rinon; Bermano, Amit |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.06.2022 |
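Subjects | Aerospace electronics, Generators, Pattern recognition, Real-time systems, Robustness, Semantics, Training, Vision + graphics; Vision applications and systems |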
Online Access | Get full text |
ISSN | 1063-6919 |
DOI | 10.1109/CVPR52688.2022.01796 |
Abstract | The inversion of real images into StyleGAN's latent space is a well-studied problem. Nevertheless, applying existing approaches to real-world scenarios remains an open challenge, due to an inherent trade-off between reconstruction and editability: latent space regions which can accurately represent real images typically suffer from degraded semantic control. Recent work proposes to mitigate this trade-off by fine-tuning the generator to add the target image to well-behaved, editable regions of the latent space. While promising, this fine-tuning scheme is impractical for prevalent use as it requires a lengthy training phase for each new image. In this work, we introduce this approach into the realm of encoder-based inversion. We propose HyperStyle, a hypernetwork that learns to modulate StyleGAN's weights to faithfully express a given image in editable regions of the latent space. A naive modulation approach would require training a hypernetwork with over three billion parameters. Through careful network design, we reduce this to be in line with existing encoders. HyperStyle yields reconstructions comparable to those of optimization techniques with the near real-time inference capabilities of encoders. Lastly, we demonstrate HyperStyle's effectiveness on several applications beyond the inversion task, including the editing of out-of-domain images which were never seen during training. Code is available on our project page: https://yuval-alaluf.github.io/hyperstyle/. |
---|---|
Author | Alaluf, Yuval; Tov, Omer; Bermano, Amit; Mokady, Ron; Gal, Rinon |
Author_xml | 1. Yuval Alaluf (Blavatnik School of Computer Science, Tel Aviv University); 2. Omer Tov (Blavatnik School of Computer Science, Tel Aviv University); 3. Ron Mokady (Blavatnik School of Computer Science, Tel Aviv University); 4. Rinon Gal (Blavatnik School of Computer Science, Tel Aviv University); 5. Amit Bermano (Blavatnik School of Computer Science, Tel Aviv University) |
CODEN | IEEPAD |
ContentType | Conference Proceeding |
Discipline | Applied Sciences |
EISBN | 1665469463, 9781665469463 |
EISSN | 1063-6919 |
EndPage | 18500 |
ExternalDocumentID | 9879514 |
Genre | orig-research |
IsPeerReviewed | false |
IsScholarly | true |
Language | English |
PageCount | 11 |
PublicationCentury | 2000 |
PublicationDate | 2022-June |
PublicationDateYYYYMMDD | 2022-06-01 |
PublicationDate_xml | – month: 06 year: 2022 text: 2022-June |
PublicationDecade | 2020 |
PublicationTitle | Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) |
PublicationTitleAbbrev | CVPR |
PublicationYear | 2022 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
StartPage | 18490 |
SubjectTerms | Aerospace electronics, Generators, Pattern recognition, Real-time systems, Robustness, Semantics, Training, Vision + graphics; Vision applications and systems |
Title | HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing |
URI | https://ieeexplore.ieee.org/document/9879514 |
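
The abstract notes that a naive weight-modulation hypernetwork would need over three billion parameters, and that careful design brings this in line with existing encoders. The sketch below is only a rough illustration of why predicting one offset per generator weight scales badly compared with predicting one offset per output channel. The layer shape, the feature size, the single-linear-head design, and the `(1 + offset)` scaling rule are all assumptions made for illustration; none of them is taken from this record.

```python
# Illustrative sketch only. FEATURE_DIM, LAYER, and both "head" designs are
# hypothetical; they are not the architecture described in the paper.

def naive_head_params(feature_dim, c_out, c_in, k):
    """Parameters of one linear head that predicts an offset for every
    weight of a k x k convolution with c_in inputs and c_out outputs."""
    n_weights = c_out * c_in * k * k
    return feature_dim * n_weights + n_weights  # weight matrix + biases


def per_channel_head_params(feature_dim, c_out):
    """Parameters of one linear head that predicts a single offset per
    output channel, shared across all weights of that channel."""
    return feature_dim * c_out + c_out  # weight matrix + biases


def modulate(weights, offsets):
    """Scale each output channel's weights by (1 + offset).

    weights: list of per-output-channel weight lists
    offsets: one float per output channel
    A zero offset leaves the channel unchanged.
    """
    return [[w * (1.0 + d) for w in channel]
            for channel, d in zip(weights, offsets)]


FEATURE_DIM = 512                        # assumed size of the image feature
LAYER = dict(c_out=512, c_in=512, k=3)   # assumed shape of one conv layer

naive = naive_head_params(FEATURE_DIM, **LAYER)
shared = per_channel_head_params(FEATURE_DIM, LAYER["c_out"])

print(f"per-weight head:  {naive:,} parameters")
print(f"per-channel head: {shared:,} parameters")
print(modulate([[1.0, 2.0], [3.0, 4.0]], [0.5, 0.0]))
```

Under these assumed shapes, a per-weight head for a single layer already exceeds a billion parameters, while the per-channel head stays in the hundreds of thousands. This is the kind of gap the abstract's "careful network design" remark points at, although the actual design choices are described in the paper itself rather than here.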