Momentum Contrast for Unsupervised Visual Representation Learning
Published in | Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) pp. 9726 - 9735 |
---|---|
Main Authors | He, Kaiming; Fan, Haoqi; Wu, Yuxin; Xie, Saining; Girshick, Ross |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.06.2020 |
ISSN | 1063-6919 |
DOI | 10.1109/CVPR42600.2020.00975 |
Abstract | We present Momentum Contrast (MoCo) for unsupervised visual representation learning. From a perspective on contrastive learning as dictionary look-up, we build a dynamic dictionary with a queue and a moving-averaged encoder. This enables building a large and consistent dictionary on-the-fly that facilitates contrastive unsupervised learning. MoCo provides competitive results under the common linear protocol on ImageNet classification. More importantly, the representations learned by MoCo transfer well to downstream tasks. MoCo can outperform its supervised pre-training counterpart in 7 detection/segmentation tasks on PASCAL VOC, COCO, and other datasets, sometimes surpassing it by large margins. This suggests that the gap between unsupervised and supervised representation learning has been largely closed in many vision tasks. |
Author | Xie, Saining; Fan, Haoqi; Wu, Yuxin; Girshick, Ross; He, Kaiming |
Author_xml | 1. He, Kaiming (Facebook AI Research (FAIR)); 2. Fan, Haoqi (Facebook AI Research (FAIR)); 3. Wu, Yuxin (Facebook AI Research (FAIR)); 4. Xie, Saining (Facebook AI Research (FAIR)); 5. Girshick, Ross (Facebook AI Research (FAIR)) |
CODEN | IEEPAD |
ContentType | Conference Proceeding |
DBID | 6IE 6IH CBEJK RIE RIO |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Proceedings Order Plan (POP) 1998-present by volume IEEE Xplore All Conference Proceedings IEL IEEE Proceedings Order Plans (POP) 1998-present |
Discipline | Applied Sciences |
EISBN | 9781728171685 1728171687 |
EISSN | 1063-6919 |
EndPage | 9735 |
ExternalDocumentID | 9157636 |
Genre | orig-research |
IsPeerReviewed | false |
IsScholarly | true |
PageCount | 10 |
ParticipantIDs | ieee_primary_9157636 |
PublicationCentury | 2000 |
PublicationDate | 2020-Jun |
PublicationDateYYYYMMDD | 2020-06-01 |
PublicationDate_xml | – month: 06 year: 2020 text: 2020-Jun |
PublicationDecade | 2020 |
PublicationTitle | Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) |
PublicationTitleAbbrev | CVPR |
PublicationYear | 2020 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
SourceID | ieee |
SourceType | Publisher |
StartPage | 9726 |
SubjectTerms | Buildings; Dictionaries; Loss measurement; Task analysis; Training; Unsupervised learning; Visualization |
Title | Momentum Contrast for Unsupervised Visual Representation Learning |
URI | https://ieeexplore.ieee.org/document/9157636 |
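
The abstract in this record describes the method only at a high level: a dictionary kept as a queue, and a key encoder that is a moving average of the query encoder. The following is a minimal sketch of those two ideas plus a softmax-style contrastive loss, assuming plain Python lists as stand-ins for encoder weights and feature vectors. The function names, the momentum value, the queue length of 4, and the temperature are illustrative assumptions and are not taken from this record.

```python
import math
from collections import deque


def momentum_update(key_params, query_params, m=0.999):
    """Moving average of the key encoder: k <- m * k + (1 - m) * q.

    Parameters are plain lists of floats standing in for encoder weights
    (an assumption made for this sketch).
    """
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def contrastive_loss(query, positive_key, negative_keys, temperature=0.07):
    """Softmax cross-entropy with the positive key as the target class."""
    logits = [dot(query, positive_key) / temperature]
    logits += [dot(query, k) / temperature for k in negative_keys]
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum_exp - logits[0]


# The dictionary is a fixed-size FIFO queue: once maxlen is reached,
# appending a new key evicts the oldest one.
dictionary = deque(maxlen=4)  # toy size, chosen only for illustration
for step in range(6):
    dictionary.append([math.cos(step), math.sin(step)])  # unit-length toy keys

query = [1.0, 0.0]
loss = contrastive_loss(query, [1.0, 0.0], list(dictionary))
key_weights = momentum_update([1.0, 2.0], [0.0, 0.0], m=0.9)
```

In this sketch the queue decouples the dictionary size from the batch size, and a momentum close to 1 makes the key encoder change slowly, which is one way to read the abstract's "large and consistent dictionary".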