HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing

The inversion of real images into StyleGAN's latent space is a well-studied problem. Nevertheless, applying existing approaches to real-world scenarios remains an open challenge, due to an inherent trade-off between reconstruction and editability: latent space regions which can accurately repre...

Bibliographic Details
Published in	Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) pp. 18490 - 18500
Main Authors	Alaluf, Yuval; Tov, Omer; Mokady, Ron; Gal, Rinon; Bermano, Amit
Format	Conference Proceeding
Language	English
Published	IEEE, 1 June 2022
Subjects	Aerospace electronics; Generators; Pattern recognition; Real-time systems; Robustness; Semantics; Training; Vision + graphics; Vision applications and systems
ISSN	1063-6919
DOI	10.1109/CVPR52688.2022.01796

Abstract	The inversion of real images into StyleGAN's latent space is a well-studied problem. Nevertheless, applying existing approaches to real-world scenarios remains an open challenge, due to an inherent trade-off between reconstruction and editability: latent space regions which can accurately represent real images typically suffer from degraded semantic control. Recent work proposes to mitigate this trade-off by fine-tuning the generator to add the target image to well-behaved, editable regions of the latent space. While promising, this fine-tuning scheme is impractical for prevalent use as it requires a lengthy training phase for each new image. In this work, we introduce this approach into the realm of encoder-based inversion. We propose HyperStyle, a hypernetwork that learns to modulate StyleGAN's weights to faithfully express a given image in editable regions of the latent space. A naive modulation approach would require training a hypernetwork with over three billion parameters. Through careful network design, we reduce this to be in line with existing encoders. HyperStyle yields reconstructions comparable to those of optimization techniques with the near real-time inference capabilities of encoders. Lastly, we demonstrate HyperStyle's effectiveness on several applications beyond the inversion task, including the editing of out-of-domain images which were never seen during training. Code is available on our project page: https://yuval-alaluf.github.io/hyperstyle/.
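The abstract's point about parameter counts can be made concrete with a small sketch. It is illustrative only: the multiplicative form `w * (1 + offset)` is one common way to write a weight modulation and is assumed here because the abstract gives no formula, and the layer shape (3x3 kernel, 512 input and output channels) and the 512-dimensional image feature are hypothetical values, not figures taken from the paper. The function names `modulate` and `dense_head_params` are likewise made up for this example.

```python
# Illustrative sketch; shapes and the modulation formula are assumptions,
# not the authors' published configuration.

def modulate(weights, offsets):
    """Scale each weight by (1 + offset); a zero offset leaves it unchanged."""
    return [[w * (1.0 + d) for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weights, offsets)]

def dense_head_params(feature_dim, n_outputs):
    """Parameters in one bias-free fully connected layer: feature_dim -> n_outputs."""
    return feature_dim * n_outputs

# Hypothetical convolution layer and image-feature size.
k, c_in, c_out, feature_dim = 3, 512, 512, 512

# Naive head: predict one offset for every weight of the layer.
per_weight = dense_head_params(feature_dim, k * k * c_in * c_out)

# Cheaper head: predict one offset per output channel and share it
# across that channel's weights.
per_channel = dense_head_params(feature_dim, c_out)

print(f"per-weight head:  {per_weight:,} parameters")
print(f"per-channel head: {per_channel:,} parameters")
print(modulate([[2.0, 4.0]], [[0.0, 0.5]]))
```

Under these assumed shapes, predicting an offset for every weight of a single layer already costs more than a billion hypernetwork parameters, which suggests why a naive design grows so large and why sharing predicted offsets across groups of weights is a natural way to shrink it.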
Author Affiliations	Yuval Alaluf; Omer Tov; Ron Mokady; Rinon Gal; Amit Bermano (all: Blavatnik School of Computer Science, Tel Aviv University)
CODEN	IEEPAD
Discipline	Applied Sciences
EISBN	1665469463 9781665469463
EISSN	1063-6919
Genre	orig-research
IsPeerReviewed	false
IsScholarly	true
PageCount	11
PublicationTitleAbbrev	CVPR
URI	https://ieeexplore.ieee.org/document/9879514