Context Encoders: Feature Learning by Inpainting

Bibliographic Details
Published in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2536 - 2544
Main Authors Pathak, Deepak, Krahenbuhl, Philipp, Donahue, Jeff, Darrell, Trevor, Efros, Alexei A.
Format Conference Proceeding
Language English
Published IEEE 01.06.2016
Abstract We present an unsupervised visual feature learning algorithm driven by context-based pixel prediction. By analogy with auto-encoders, we propose Context Encoders - a convolutional neural network trained to generate the contents of an arbitrary image region conditioned on its surroundings. In order to succeed at this task, context encoders need to both understand the content of the entire image, as well as produce a plausible hypothesis for the missing part(s). When training context encoders, we have experimented with both a standard pixel-wise reconstruction loss, as well as a reconstruction plus an adversarial loss. The latter produces much sharper results because it can better handle multiple modes in the output. We found that a context encoder learns a representation that captures not just appearance but also the semantics of visual structures. We quantitatively demonstrate the effectiveness of our learned features for CNN pre-training on classification, detection, and segmentation tasks. Furthermore, context encoders can be used for semantic inpainting tasks, either stand-alone or as initialization for non-parametric methods.
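The abstract describes training a context encoder with a pixel-wise reconstruction loss combined with an adversarial loss. The following is a minimal, illustrative PyTorch sketch of that joint objective; the network shapes, layer counts, and the loss weights are assumptions chosen for illustration, not values taken from this record or the paper.

```python
# Minimal sketch (not the authors' code) of the joint loss described in the
# abstract: an encoder-decoder "context encoder" predicts a masked image
# region, and its output is scored by an L2 reconstruction term plus an
# adversarial term from a discriminator. All sizes and weights are assumed.
import torch
import torch.nn as nn

class ContextEncoder(nn.Module):
    """Encoder-decoder that predicts the missing central patch of an image."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(              # 3x128x128 -> 64x32x32 (assumed sizes)
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(              # 64x32x32 -> 3x64x64 hole prediction
            nn.ConvTranspose2d(64, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, masked_image):
        return self.decoder(self.encoder(masked_image))

# Hypothetical discriminator: real/fake probability for a 3x64x64 patch.
discriminator = nn.Sequential(
    nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 1, 4, stride=2, padding=1), nn.Flatten(),
    nn.Linear(16 * 16, 1), nn.Sigmoid(),
)

def generator_loss(pred_patch, true_patch, w_rec=0.999, w_adv=0.001):
    """Pixel-wise L2 reconstruction plus adversarial loss (weights assumed)."""
    rec = nn.functional.mse_loss(pred_patch, true_patch)
    adv = nn.functional.binary_cross_entropy(
        discriminator(pred_patch), torch.ones(pred_patch.size(0), 1))
    return w_rec * rec + w_adv * adv

# Toy usage: masked 128x128 inputs and their true 64x64 center patches.
model = ContextEncoder()
masked = torch.randn(2, 3, 128, 128)
target = torch.randn(2, 3, 64, 64)
loss = generator_loss(model(masked), target)
```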
Author Pathak, Deepak
Krahenbuhl, Philipp
Donahue, Jeff
Darrell, Trevor
Efros, Alexei A.
Author_xml – sequence: 1
  givenname: Deepak
  surname: Pathak
  fullname: Pathak, Deepak
  email: pathak@cs.berkeley.edu
  organization: Univ. of California, Berkeley, Berkeley, CA, USA
– sequence: 2
  givenname: Philipp
  surname: Krahenbuhl
  fullname: Krahenbuhl, Philipp
  email: philkr@cs.berkeley.edu
  organization: Univ. of California, Berkeley, Berkeley, CA, USA
– sequence: 3
  givenname: Jeff
  surname: Donahue
  fullname: Donahue, Jeff
  email: jdonahue@cs.berkeley.edu
  organization: Univ. of California, Berkeley, Berkeley, CA, USA
– sequence: 4
  givenname: Trevor
  surname: Darrell
  fullname: Darrell, Trevor
  email: trevor@cs.berkeley.edu
  organization: Univ. of California, Berkeley, Berkeley, CA, USA
– sequence: 5
  givenname: Alexei A.
  surname: Efros
  fullname: Efros, Alexei A.
  email: efros@cs.berkeley.edu
  organization: Univ. of California, Berkeley, Berkeley, CA, USA
BookMark eNotjMFKw0AQQFdRsNYcPXnZH0icyWZ3drxJaLUQUES9lk0ykYhuShLB_n0LenmPd3mX6iwOUZS6RsgQgW_L9-eXLAd0WU7-RCVMHgtHxnuLeKoWCM6kjpEvVDJNnwCA7Dx6XigohzjL76xXsRlaGac7vZYw_4yiKwlj7OOHrvd6E3ehj_OxrtR5F74mSf69VG_r1Wv5mFZPD5vyvkp7JDuntZOcamtcZziXorbE3nF3pDXQGgqWyDVcCIrDru2CAdOw1NTahqEFs1Q3f99eRLa7sf8O435L5MEVZA753kS7
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/CVPR.2016.278
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP) 1998-present
Discipline Applied Sciences
Computer Science
EISBN 9781467388511
1467388513
EISSN 1063-6919
EndPage 2544
ExternalDocumentID 7780647
Genre orig-research
IsPeerReviewed false
IsScholarly true
Language English
PageCount 9
PublicationCentury 2000
PublicationDate 2016-June
PublicationDateYYYYMMDD 2016-06-01
PublicationDate_xml – month: 06
  year: 2016
  text: 2016-June
PublicationDecade 2010
PublicationTitle 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
PublicationTitleAbbrev CVPR
PublicationYear 2016
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 2536
SubjectTerms Computer architecture
Context
Convolutional codes
Decoding
Image reconstruction
Semantics
Visualization
Title Context Encoders: Feature Learning by Inpainting
URI https://ieeexplore.ieee.org/document/7780647