Learning Geometry-aware Representations by Sketching
Published in | 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) pp. 23315 - 23326 |
---|---|
Main Authors | Lee, Hyundo; Hwang, Inwoo; Go, Hyunsung; Choi, Won-Seok; Kim, Kibeom; Zhang, Byoung-Tak |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.06.2023 |
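Subjects | Focusing; Representation learning; Scalability; Self-supervised or unsupervised representation learning; Semantics; Shape; Training; Visualization |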
Abstract | Understanding geometric concepts, such as distance and shape, is essential for understanding the real world and also for many vision tasks. To incorporate such information into a visual representation of a scene, we propose learning to represent the scene by sketching, inspired by human behavior. Our method, coined Learning by Sketching (LBS), learns to convert an image into a set of colored strokes that explicitly incorporate the geometric information of the scene in a single inference step without requiring a sketch dataset. A sketch is then generated from the strokes where CLIP-based perceptual loss maintains a semantic similarity between the sketch and the image. We show theoretically that sketching is equivariant with respect to arbitrary affine transformations and thus provably preserves geometric information. Experimental results show that LBS substantially improves the performance of object attribute classification on the unlabeled CLEVR dataset, domain transfer between CLEVR and STL-10 datasets, and for diverse downstream tasks, confirming that LBS provides rich geometric information. |
---|---|
Author | Choi, Won-Seok; Lee, Hyundo; Hwang, Inwoo; Go, Hyunsung; Kim, Kibeom; Zhang, Byoung-Tak |
Author_xml | – sequence: 1 givenname: Hyundo surname: Lee fullname: Lee, Hyundo email: hdlee@bi.snu.ac.kr organization: AI Institute, Seoul National University – sequence: 2 givenname: Inwoo surname: Hwang fullname: Hwang, Inwoo email: iwhwang@bi.snu.ac.kr organization: AI Institute, Seoul National University – sequence: 3 givenname: Hyunsung surname: Go fullname: Go, Hyunsung email: hsgo@bi.snu.ac.kr organization: AI Institute, Seoul National University – sequence: 4 givenname: Won-Seok surname: Choi fullname: Choi, Won-Seok email: wchoi@bi.snu.ac.kr organization: AI Institute, Seoul National University – sequence: 5 givenname: Kibeom surname: Kim fullname: Kim, Kibeom email: kbkim@bi.snu.ac.kr organization: AI Institute, Seoul National University – sequence: 6 givenname: Byoung-Tak surname: Zhang fullname: Zhang, Byoung-Tak email: btzhang@bi.snu.ac.kr organization: AI Institute, Seoul National University |
CODEN | IEEPAD |
ContentType | Conference Proceeding |
DOI | 10.1109/CVPR52729.2023.02233 |
EISBN | 9798350301298 |
EISSN | 2575-7075 |
EndPage | 23326 |
ExternalDocumentID | 10203107 |
Genre | orig-research |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | true |
Language | English |
OpenAccessLink | http://arxiv.org/pdf/2304.08204 |
PageCount | 12 |
PublicationCentury | 2000 |
PublicationDate | 2023-June |
PublicationDateYYYYMMDD | 2023-06-01 |
PublicationDate_xml | – month: 06 year: 2023 text: 2023-June |
PublicationDecade | 2020 |
PublicationTitle | 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) |
PublicationTitleAbbrev | CVPR |
PublicationYear | 2023 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
StartPage | 23315 |
SubjectTerms | Focusing; Representation learning; Scalability; Self-supervised or unsupervised representation learning; Semantics; Shape; Training; Visualization |
Title | Learning Geometry-aware Representations by Sketching |
URI | https://ieeexplore.ieee.org/document/10203107 |
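
The abstract's claim that sketching is equivariant with respect to affine transformations can be illustrated with a small numerical check. The sketch below is not the LBS method itself: it assumes, purely for illustration, that a stroke is a quadratic Bézier curve given by three 2-D control points (the abstract does not specify the stroke parameterization), and the names `bezier_points` and `apply_affine` are hypothetical helpers. It shows that transforming the control points and then drawing the curve gives the same points as drawing the curve and then transforming it.

```python
import numpy as np

def bezier_points(control_points, num_samples=50):
    """Sample a quadratic Bezier curve from three 2-D control points.

    Assumption for illustration only: strokes are quadratic Bezier curves.
    """
    p0, p1, p2 = np.asarray(control_points, dtype=float)
    t = np.linspace(0.0, 1.0, num_samples)[:, None]  # shape (num_samples, 1)
    # Bernstein weights sum to 1 for every t, which is what makes the
    # curve commute with affine maps.
    return (1 - t) ** 2 * p0 + 2 * (1 - t) * t * p1 + t ** 2 * p2

def apply_affine(points, matrix, offset):
    """Apply the affine map x -> A x + b to each row of `points`."""
    points = np.asarray(points, dtype=float)
    return points @ np.asarray(matrix, dtype=float).T + np.asarray(offset, dtype=float)

# Hypothetical stroke and an arbitrary (non-orthogonal) affine transformation.
control = np.array([[0.0, 0.0], [1.0, 2.0], [3.0, 1.0]])
A = np.array([[1.5, -0.5], [0.3, 0.8]])
b = np.array([2.0, -1.0])

# Path 1: draw the stroke, then transform the drawn points.
curve_then_transform = apply_affine(bezier_points(control), A, b)
# Path 2: transform the stroke parameters, then draw.
transform_then_curve = bezier_points(apply_affine(control, A, b))

equivariant = bool(np.allclose(curve_then_transform, transform_then_curve))
```

Under this assumption, the two paths agree up to floating-point error, which is the geometric property the abstract relies on. The check covers only the stroke geometry, not the learned image-to-stroke network or the CLIP-based loss.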