Do Vision and Language Models Share Concepts? A Vector Space Alignment Study
Published in | Transactions of the Association for Computational Linguistics, Vol. 12, pp. 1232–1249
Main Authors | Jiaang Li, Yova Kementchedjhieva, Constanza Fierro, Anders Søgaard
Format | Journal Article
Language | English
Published | The MIT Press, 255 Main Street, 9th Floor, Cambridge, Massachusetts 02142, USA, 30.09.2024
Online Access | https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00698 ; https://doaj.org/article/ee8d0e6d68d9476e85abf2f725793cda
ISSN | 2307-387X
EISSN | 2307-387X
DOI | 10.1162/tacl_a_00698
Abstract | Large-scale pretrained language models (LMs) are said to “lack the ability to connect utterances to the world” (Bender and Koller, 2020), because they do not have “mental models of the world” (Mitchell and Krakauer, 2023). If so, one would expect LM representations to be unrelated to representations induced by vision models. We present an empirical evaluation across four families of LMs (BERT, GPT-2, OPT, and LLaMA-2) and three vision model architectures (ResNet, SegFormer, and MAE). Our experiments show that LMs partially converge towards representations isomorphic to those of vision models, subject to dispersion, polysemy, and frequency. This has important implications for both multi-modal processing and the LM understanding debate (Mitchell and Krakauer, 2023).
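The record does not spell out the alignment procedure, but the cited orthogonal Procrustes solution (Schönemann, 1966) points to the standard recipe for testing whether two embedding spaces are near-isomorphic: fit a rotation from one space onto the other over paired concepts, then check how often each mapped vector lands nearest its true counterpart. The sketch below is a minimal illustration under that assumption; the function names (procrustes_align, precision_at_1), the 500-concept toy data, and the precision@1 metric are illustrative choices of mine, not taken from the paper.

```python
import numpy as np

def procrustes_align(X, Y):
    """Orthogonal Procrustes (Schönemann, 1966): the rotation W that
    minimizes ||X W - Y||_F over paired rows of X and Y."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

def precision_at_1(X, Y, W):
    """Fraction of rows of X whose mapped vector X W is nearest (by
    cosine similarity) to its paired row in Y."""
    Xm = X @ W
    Xm /= np.linalg.norm(Xm, axis=1, keepdims=True)
    Yn = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    nearest = (Xm @ Yn.T).argmax(axis=1)
    return float((nearest == np.arange(len(X))).mean())

# Toy stand-ins for LM and vision embeddings of the same 500 concepts:
# both derive from a shared latent structure, differing by a random
# rotation plus noise, so a rotation should largely realign them.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 64))
Q, _ = np.linalg.qr(rng.normal(size=(64, 64)))   # random orthogonal map
lm_emb = latent @ Q + 0.1 * rng.normal(size=(500, 64))
vis_emb = latent + 0.1 * rng.normal(size=(500, 64))

W = procrustes_align(lm_emb, vis_emb)
print(f"precision@1 after alignment: {precision_at_1(lm_emb, vis_emb, W):.2f}")
```

On this synthetic data the rotation is recoverable almost exactly; with real LM and vision embeddings the abstract reports only partial convergence, modulated by dispersion, polysemy, and frequency.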
Author Details |
– Jiaang Li, University of Copenhagen, Denmark (jili@di.ku.dk)
– Yova Kementchedjhieva, Mohamed bin Zayed University of Artificial Intelligence, United Arab Emirates (yova.kementchedjhieva@mbzuai.ac.ae)
– Constanza Fierro, University of Copenhagen, Denmark (c.fierro@di.ku.dk)
– Anders Søgaard, University of Copenhagen, Denmark (soegaard@di.ku.dk)
Open Access | Yes
Peer Reviewed | Yes
References |
– Abdou (2021). Can language models encode perceptual structure without grounding? A case study in color. Proceedings of the 25th Conference on Computational Natural Language Learning, p. 109. doi:10.18653/v1/2021.conll-1.9
– Antonello (2022). Predictive coding or just feature discovery? An alternative account of why language models fit brain data. Neurobiology of Language, p. 1. doi:10.1162/nol_a_00087
– Artetxe (2018). A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings. ACL. doi:10.18653/v1/P18-1073
– Bender (2020). Climbing towards NLU: On meaning, form, and understanding in the age of data. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, p. 5185. doi:10.18653/v1/2020.acl-main.463
– Bergsma (2011). Learning bilingual lexicons using the visual similarity of labeled web images. Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, p. 1764.
– Bird (2009). Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit.
– Brendel (2004). Intuition pumps and the proper use of thought experiments. Dialectica, 58(1), p. 89. doi:10.1111/j.1746-8361.2004.tb00293.x
– Butlin (2021). Sharing our concepts with machines. Erkenntnis, p. 1. doi:10.1007/s10670-021-00491-w
– Cappelen (2021). Making AI Intelligible: Philosophical Foundations. doi:10.1093/oso/9780192894724.001.0001
– Caron (2021). Emerging properties in self-supervised vision transformers. Proceedings of the IEEE/CVF International Conference on Computer Vision, p. 9650. doi:10.1109/ICCV48922.2021.00951
– Caucheteux (2022). Brains and algorithms partially converge in natural language processing. Communications Biology. doi:10.1038/s42003-022-03036-1
– Caucheteux (2022). Long-range and hierarchical language predictions in brains and algorithms. Nature Human Behaviour. doi:10.48550/arXiv.2111.14232
– Conneau (2018). Word translation without parallel data. Proceedings of ICLR 2018.
– Devlin (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), p. 4171.
– Dosovitskiy (2021). An image is worth 16x16 words: Transformers for image recognition at scale. 9th International Conference on Learning Representations, ICLR 2021.
– Fellbaum (2010). WordNet. In Theory and Applications of Ontology: Computer Applications, p. 231. doi:10.1007/978-90-481-8847-5_10
– Garneau (2021). Analogy training multilingual encoders. Proceedings of the AAAI Conference on Artificial Intelligence, 35(14), p. 12884. doi:10.1609/aaai.v35i14.17524
– Glavaš (2020). Non-linear instance-based cross-lingual mapping for non-isomorphic embedding spaces. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, p. 7548. doi:10.18653/v1/2020.acl-main.675
– Goldstein (2021). Thinking ahead: Spontaneous prediction in context as a keystone of language in humans and machines. bioRxiv. doi:10.1101/2020.12.02.403477
– Halvagal (2022). The combination of Hebbian and predictive plasticity learns invariant object representations in deep sensory networks. bioRxiv. doi:10.1101/2022.03.17.484712
– Hartmann (2018). Limitations of cross-lingual learning from image search. Proceedings of the Third Workshop on Representation Learning for NLP, p. 159. doi:10.18653/v1/W18-3021
– Hartmann (2018). Why is unsupervised alignment of English embeddings from different algorithms so hard? Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, p. 582. doi:10.18653/v1/D18-1056
– He (2016). Deep residual learning for image recognition. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, p. 770. doi:10.1109/CVPR.2016.90
– He (2022). Masked autoencoders are scalable vision learners. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, p. 16000. doi:10.1109/CVPR52688.2022.01553
– Hoshen (2018). An iterative closest point method for unsupervised word translation. CoRR, arXiv:1801.06126. doi:10.18653/v1/D18-1043
– Huh (2024). The platonic representation hypothesis. arXiv:2405.07987.
– Kiela (2014). Learning image embeddings using convolutional neural networks for improved multi-modal semantics. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), p. 36. doi:10.3115/v1/D14-1005
– Kiela (2015). Visual bilingual lexicon induction with transferred ConvNet features. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, p. 148. doi:10.18653/v1/D15-1015
– Lazaridou (2014). Is this a wampimuk? Cross-modal mapping between distributional semantics and the visual world. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), p. 1403. doi:10.3115/v1/P14-1132
– Li (2023). Structural similarities between language models and neural response measurements. NeurIPS 2023 Workshop on Symmetry and Geometry in Neural Representations.
– Lodge (1998). Stepping back inside Leibniz’s mill. The Monist, 81(4), p. 553. doi:10.5840/monist199881427
– Mandelkern (2023). Do language models refer? doi:10.1162/coli_a_00522
– Manning (2020). Emergent linguistic structure in artificial neural networks trained by self-supervision. Proceedings of the National Academy of Sciences, 117(48), p. 30046. doi:10.1073/pnas.1907367117
– Marconi (1997). Lexical Competence.
– Marcus (2023). A sentence is worth a thousand pictures: Can large language models understand human language?
– Merullo (2023). Linearly mapping from image to text space. The Eleventh International Conference on Learning Representations.
– Minnema (2019). From brain space to distributional space: The perilous journeys of fMRI decoding. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, p. 155. doi:10.18653/v1/P19-2021
– Mitchell (2023). The debate over understanding in AI’s large language models. Proceedings of the National Academy of Sciences, 120(13), e2215907120. doi:10.1073/pnas.2215907120
– Mollo (2023). The vector grounding problem.
– Nakashole (2018). NORMA: Neighborhood sensitive maps for multilingual word embeddings. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, p. 512. doi:10.18653/v1/D18-1047
– Navigli (2012). BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network. Artificial Intelligence, 193, p. 217. doi:10.1016/j.artint.2012.07.001
– Orhan (2020). Self-supervised learning through the eyes of a child. Advances in Neural Information Processing Systems, p. 9960.
– Paszke (2019). PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32, p. 8024.
– Piantadosi (2022). Meaning without reference in large language models. NeurIPS 2022 Workshop on Neuro Causal and Symbolic AI (nCSI).
– Radford (2019). Language models are unsupervised multitask learners.
– Radford (2021). Learning transferable visual models from natural language supervision. International Conference on Machine Learning, p. 8748.
– Rapaport (2002). Holism, conceptual-role semantics, and syntactic semantics. Minds and Machines, 12(1), p. 3. doi:10.1023/a:1013765011735
– Russakovsky (2015). ImageNet large scale visual recognition challenge. International Journal of Computer Vision (IJCV), 115(3), p. 211. doi:10.1007/s11263-015-0816-y
– Sahlgren (2021). The singleton fallacy: Why current critiques of language models miss the point. Frontiers in Artificial Intelligence, 4(682578). doi:10.3389/frai.2021.682578
– Sassenhagen (2020). Traces of meaning itself: Encoding distributional word vectors in brain activity. Neurobiology of Language, 1(1), p. 54. doi:10.1162/nol_a_00003
– Schönemann (1966). A generalized solution of the orthogonal Procrustes problem. Psychometrika, 31(1), p. 1. doi:10.1007/BF02289451
– Schrimpf (2018). Brain-score: Which artificial neural network for object recognition is most brain-like? bioRxiv. doi:10.1101/407007
– Schrimpf (2021). The neural architecture of language: Integrative modeling converges on predictive processing. bioRxiv. doi:10.1073/pnas.2105646118
– Searle (1980). Minds, brains, and programs. Behavioral and Brain Sciences, 3, p. 417. doi:10.1017/S0140525X00005756
– Shea (2018). Representation in Cognitive Science. doi:10.1093/oso/9780198812883.001.0001
– Søgaard (2018). On the limitations of unsupervised bilingual dictionary induction. doi:10.18653/v1/P18-1072
– Teehan (2022). Emergent structures and training dynamics in large language models. Proceedings of BigScience Episode #5 – Workshop on Challenges & Perspectives in Creating Large Language Models, p. 146. doi:10.18653/v1/2022.bigscience-1.11
– Toneva (2019). Interpreting and improving natural-language processing (in machines) with natural language-processing (in the brain). Advances in Neural Information Processing Systems, 32.
– Touvron (2023). Llama 2: Open foundation and fine-tuned chat models. arXiv:2307.09288.
– Turc (2019). Well-read students learn better: On the importance of pre-training compact models. arXiv:1908.08962v2.
– Vulić (2016). Multi-modal representations for improved bilingual lexicon learning. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), p. 188. doi:10.18653/v1/P16-2031
– Wei (2022). Emergent abilities of large language models. Transactions on Machine Learning Research.
– Williams (2018). Predictive processing and the representation wars. Minds and Machines, 28(1), p. 141. doi:10.1007/s11023-017-9441-6
– Wolf (2020). Transformers: State-of-the-art natural language processing. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, p. 38. doi:10.18653/v1/2020.emnlp-demos.6
– Xie (2021). SegFormer: Simple and efficient design for semantic segmentation with transformers. Advances in Neural Information Processing Systems, p. 12077.
– Zhang (2022). OPT: Open pre-trained transformer language models.
– Zhao (2020). Non-linearity in mapping based cross-lingual word embeddings. Proceedings of the 12th Language Resources and Evaluation Conference, p. 3583.
– Zhou (2017). Scene parsing through ADE20K dataset. Computer Vision and Pattern Recognition. doi:10.1109/CVPR.2017.544
– Zhu (2015). Aligning books and movies: Towards story-like visual explanations by watching movies and reading books. 2015 IEEE International Conference on Computer Vision (ICCV), p. 19. doi:10.1109/ICCV.2015.11
– Zou (2023). Representation engineering: A top-down approach to AI transparency. arXiv:2310.01405.