HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing

The inversion of real images into StyleGAN's latent space is a well-studied problem. Nevertheless, applying existing approaches to real-world scenarios remains an open challenge, due to an inherent trade-off between reconstruction and editability: latent space regions which can accurately repre...

Bibliographic Details
Published in Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online), pp. 18490-18500
Main Authors Alaluf, Yuval; Tov, Omer; Mokady, Ron; Gal, Rinon; Bermano, Amit
Format Conference Proceeding
Language English
Published IEEE 01.06.2022
ISSN 1063-6919
DOI 10.1109/CVPR52688.2022.01796

Abstract The inversion of real images into StyleGAN's latent space is a well-studied problem. Nevertheless, applying existing approaches to real-world scenarios remains an open challenge, due to an inherent trade-off between reconstruction and editability: latent space regions which can accurately represent real images typically suffer from degraded semantic control. Recent work proposes to mitigate this trade-off by fine-tuning the generator to add the target image to well-behaved, editable regions of the latent space. While promising, this fine-tuning scheme is impractical for prevalent use as it requires a lengthy training phase for each new image. In this work, we introduce this approach into the realm of encoder-based inversion. We propose HyperStyle, a hypernetwork that learns to modulate StyleGAN's weights to faithfully express a given image in editable regions of the latent space. A naive modulation approach would require training a hypernetwork with over three billion parameters. Through careful network design, we reduce this to be in line with existing encoders. HyperStyle yields reconstructions comparable to those of optimization techniques with the near real-time inference capabilities of encoders. Lastly, we demonstrate HyperStyle's effectiveness on several applications beyond the inversion task, including the editing of out-of-domain images which were never seen during training. Code is available on our project page: https://yuval-alaluf.github.io/hyperstyle/.
Author Alaluf, Yuval
Tov, Omer
Bermano, Amit
Mokady, Ron
Gal, Rinon
Author_xml – sequence: 1
  givenname: Yuval
  surname: Alaluf
  fullname: Alaluf, Yuval
  organization: Blavatnik School of Computer Science, Tel Aviv University
– sequence: 2
  givenname: Omer
  surname: Tov
  fullname: Tov, Omer
  organization: Blavatnik School of Computer Science, Tel Aviv University
– sequence: 3
  givenname: Ron
  surname: Mokady
  fullname: Mokady, Ron
  organization: Blavatnik School of Computer Science, Tel Aviv University
– sequence: 4
  givenname: Rinon
  surname: Gal
  fullname: Gal, Rinon
  organization: Blavatnik School of Computer Science, Tel Aviv University
– sequence: 5
  givenname: Amit
  surname: Bermano
  fullname: Bermano, Amit
  organization: Blavatnik School of Computer Science, Tel Aviv University
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IH
CBEJK
RIE
RIO
DOI 10.1109/CVPR52688.2022.01796
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP) 1998-present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Applied Sciences
EISBN 1665469463
9781665469463
EISSN 1063-6919
EndPage 18500
ExternalDocumentID 9879514
Genre orig-research
IEDL.DBID RIE
IngestDate Wed Aug 27 02:15:10 EDT 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
PageCount 11
ParticipantIDs ieee_primary_9879514
PublicationCentury 2000
PublicationDate 2022-June
PublicationDateYYYYMMDD 2022-06-01
PublicationDate_xml – month: 06
  year: 2022
  text: 2022-June
PublicationDecade 2020
PublicationTitle Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online)
PublicationTitleAbbrev CVPR
PublicationYear 2022
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0003211698
Score 2.6227984
SourceID ieee
SourceType Publisher
StartPage 18490
SubjectTerms Aerospace electronics
Generators
Pattern recognition
Real-time systems
Robustness
Semantics
Training
Vision + graphics; Vision applications and systems
Title HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing
URI https://ieeexplore.ieee.org/document/9879514
hasFullText 1
inHoldings 1