Learning Geometry-aware Representations by Sketching

Bibliographic Details
Published in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 23315-23326
Main Authors Lee, Hyundo; Hwang, Inwoo; Go, Hyunsung; Choi, Won-Seok; Kim, Kibeom; Zhang, Byoung-Tak
Format Conference Proceeding
Language English
Published IEEE 01.06.2023
Abstract Understanding geometric concepts, such as distance and shape, is essential for understanding the real world and for many vision tasks. To incorporate such information into a visual representation of a scene, we propose learning to represent the scene by sketching, inspired by human behavior. Our method, coined Learning by Sketching (LBS), learns to convert an image into a set of colored strokes that explicitly incorporate the geometric information of the scene in a single inference step, without requiring a sketch dataset. A sketch is then generated from the strokes, where a CLIP-based perceptual loss maintains semantic similarity between the sketch and the image. We show theoretically that sketching is equivariant with respect to arbitrary affine transformations and thus provably preserves geometric information. Experimental results show that LBS substantially improves performance on object attribute classification on the unlabeled CLEVR dataset, on domain transfer between the CLEVR and STL-10 datasets, and on diverse downstream tasks, confirming that LBS provides rich geometric information.
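The affine-equivariance claim in the abstract can be made concrete with a small numerical check. The sketch below is only a toy illustration, not the authors' implementation: it assumes a stroke is a quadratic Bezier curve (this record does not state the stroke parameterization), and the names `bezier_point`, `affine`, and `max_equivariance_gap` are hypothetical. It compares transforming points on the drawn curve against drawing the curve from transformed control points.

```python
# Toy check, not the authors' implementation: strokes are ASSUMED to be
# quadratic Bezier curves; the record does not state the parameterization.

def bezier_point(control_points, t):
    """Evaluate a quadratic Bezier curve at parameter t in [0, 1]."""
    (x0, y0), (x1, y1), (x2, y2) = control_points
    # Bernstein weights; they always sum to 1, which is what makes
    # the curve commute with affine maps.
    w0, w1, w2 = (1 - t) ** 2, 2 * (1 - t) * t, t ** 2
    return (w0 * x0 + w1 * x1 + w2 * x2, w0 * y0 + w1 * y1 + w2 * y2)


def affine(point, matrix, offset):
    """Apply the affine map p -> matrix @ p + offset to a 2D point."""
    (a, b), (c, d) = matrix
    x, y = point
    return (a * x + b * y + offset[0], c * x + d * y + offset[1])


def max_equivariance_gap(control_points, matrix, offset, samples=11):
    """Largest coordinate difference between 'transform the curve' and
    'transform the control points, then draw the curve'."""
    moved_controls = [affine(p, matrix, offset) for p in control_points]
    gap = 0.0
    for i in range(samples):
        t = i / (samples - 1)
        ax, ay = affine(bezier_point(control_points, t), matrix, offset)
        bx, by = bezier_point(moved_controls, t)
        gap = max(gap, abs(ax - bx), abs(ay - by))
    return gap


stroke = [(0.1, 0.2), (0.5, 0.9), (0.8, 0.3)]   # hypothetical control points
shear_scale = [[1.5, 0.4], [-0.2, 0.7]]          # arbitrary linear part
shift = (0.3, -0.1)                              # arbitrary translation
print(max_equivariance_gap(stroke, shear_scale, shift) < 1e-9)
```

The gap stays at floating-point noise for any choice of matrix and offset, because the curve is a weighted average of its control points with weights that sum to one. This covers only the stroke geometry; the learned image-to-stroke network described in the abstract is not modeled here.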
Author Choi, Won-Seok
Lee, Hyundo
Hwang, Inwoo
Go, Hyunsung
Kim, Kibeom
Zhang, Byoung-Tak
Author_xml – sequence: 1
  givenname: Hyundo
  surname: Lee
  fullname: Lee, Hyundo
  email: hdlee@bi.snu.ac.kr
  organization: AI Institute, Seoul National University
– sequence: 2
  givenname: Inwoo
  surname: Hwang
  fullname: Hwang, Inwoo
  email: iwhwang@bi.snu.ac.kr
  organization: AI Institute, Seoul National University
– sequence: 3
  givenname: Hyunsung
  surname: Go
  fullname: Go, Hyunsung
  email: hsgo@bi.snu.ac.kr
  organization: AI Institute, Seoul National University
– sequence: 4
  givenname: Won-Seok
  surname: Choi
  fullname: Choi, Won-Seok
  email: wchoi@bi.snu.ac.kr
  organization: AI Institute, Seoul National University
– sequence: 5
  givenname: Kibeom
  surname: Kim
  fullname: Kim, Kibeom
  email: kbkim@bi.snu.ac.kr
  organization: AI Institute, Seoul National University
– sequence: 6
  givenname: Byoung-Tak
  surname: Zhang
  fullname: Zhang, Byoung-Tak
  email: btzhang@bi.snu.ac.kr
  organization: AI Institute, Seoul National University
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/CVPR52729.2023.02233
EISBN 9798350301298
EISSN 2575-7075
EndPage 23326
ExternalDocumentID 10203107
Genre orig-research
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed false
IsScholarly true
Language English
OpenAccessLink http://arxiv.org/pdf/2304.08204
PageCount 12
PublicationCentury 2000
PublicationDate 2023-June
PublicationDateYYYYMMDD 2023-06-01
PublicationDate_xml – month: 06
  year: 2023
  text: 2023-June
PublicationDecade 2020
PublicationTitle 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
PublicationTitleAbbrev CVPR
PublicationYear 2023
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 23315
SubjectTerms Focusing
Representation learning
Scalability
Self-supervised or unsupervised representation learning
Semantics
Shape
Training
Visualization
Title Learning Geometry-aware Representations by Sketching
URI https://ieeexplore.ieee.org/document/10203107