Semantic Relata for the Evaluation of Distributional Models in Mandarin Chinese

Bibliographic Details
Published in IEEE Access Vol. 7; p. 1
Main Authors Liu, Hongchao, Chersoni, Emmanuele, Klyueva, Natalia, Santus, Enrico, Huang, Chu-Ren
Format Journal Article
Language English
Published Piscataway IEEE 01.01.2019
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Abstract Distributional Semantic Models (DSMs) have established themselves as a standard for representing word and sentence meaning. However, DSMs only provide a quantitative measure of how strongly two linguistic expressions are related; they cannot automatically classify different semantic relations. Hence, the notion of semantic similarity is underspecified in DSMs. In this paper we introduce EVALution-MAN as an effort to address this underspecification problem. Following the EVALution 1.0 dataset for English, we present a dataset for evaluating DSMs on the task of identifying semantic relations in Mandarin Chinese. Moreover, we test different types of word vectors on the automatic learning of these semantic relations, evaluating them in both an unsupervised and a supervised setting. We find that distributional models tend, in general, to assign higher similarity scores to synonyms, and that deep learning classifiers perform best in identifying semantic relations.
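Illustration (not part of the catalogued article): the underspecification described in the abstract can be sketched with cosine similarity, a common vector similarity measure. The sketch below is a minimal example under stated assumptions. The three-dimensional vectors, the English stand-in words, the relation labels, and the helper names `cosine` and `score_pairs` are all hypothetical and are not taken from the paper or from the EVALution-MAN dataset.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors, invented for illustration only.
# Real DSM vectors would be learned from a corpus and have hundreds of dimensions.
toy_vectors = {
    "big": [0.9, 0.1, 0.3],
    "large": [0.8, 0.2, 0.3],
    "small": [0.7, 0.3, 0.4],
    "banana": [0.1, 0.9, 0.2],
}

# Hypothetical labelled word pairs in the spirit of a relation dataset.
toy_pairs = [
    ("big", "large", "synonym"),
    ("big", "small", "antonym"),
    ("big", "banana", "unrelated"),
]


def score_pairs(pairs, vectors):
    """Attach a cosine score to every (word1, word2, relation) triple."""
    return [(w1, w2, rel, cosine(vectors[w1], vectors[w2])) for w1, w2, rel in pairs]


scored = score_pairs(toy_pairs, toy_vectors)
for w1, w2, rel, score in scored:
    print(f"{w1:>6} / {w2:<6} {rel:<9} cosine = {score:.3f}")
```

In this toy setting the synonym pair scores highest, yet the antonym pair also scores far above the unrelated pair. The score says that the two words are related but not how, which is the gap a relation-labelled evaluation set is meant to expose.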
Author_xml – sequence: 1
  givenname: Hongchao
  surname: Liu
  fullname: Liu, Hongchao
  organization: School of Literature, Shandong University, China
– sequence: 2
  givenname: Emmanuele
  surname: Chersoni
  fullname: Chersoni, Emmanuele
  organization: Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong, China
– sequence: 3
  givenname: Natalia
  surname: Klyueva
  fullname: Klyueva, Natalia
  organization: Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong, China
– sequence: 4
  givenname: Enrico
  surname: Santus
  fullname: Santus, Enrico
  organization: Electrical Engineering and Computer Science, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, USA
– sequence: 5
  givenname: Chu-Ren
  surname: Huang
  fullname: Huang, Chu-Ren
  organization: Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong, China
CODEN IAECCG
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2019
DOI 10.1109/ACCESS.2019.2945061
Discipline Engineering
EISSN 2169-3536
EndPage 1
Genre orig-research
ISSN 2169-3536
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Language English
ORCID 0000-0001-8742-0451
0000-0001-7559-8391
0000-0002-8526-5520
OpenAccessLink https://doaj.org/article/8c85e2b4a7cf422b9a113056a829d671
PublicationCentury 2000
PublicationDate 2019-01-01
PublicationDateYYYYMMDD 2019-01-01
PublicationDate_xml – month: 01
  year: 2019
  text: 2019-01-01
  day: 01
PublicationDecade 2010
PublicationPlace Piscataway
PublicationPlace_xml – name: Piscataway
PublicationTitle IEEE access
PublicationTitleAbbrev Access
PublicationYear 2019
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 1
SubjectTerms Chinese languages
Computational modeling
Computational Semantics
Data mining
Datasets
Evaluation
Lexical Resources
Linguistics
Natural language processing
Ontologies
Relation Classification
Semantic Relations
Semantics
Similarity
Task analysis
Vector Space Models
Words (language)
SummonAdditionalLinks – databaseName: IEEE Electronic Library Online
  dbid: RIE
  link: http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwjV1Lb9wgEEZJTsmhjzzUbdOKQ47xxmCex3SbKKq0zaGNlBsCPEhR292ou3vpry-DWSt9HHpDyMaYD8wwnvk-Qs4YyKBSSA3TbWyEDW0TVKeaYHsGKUmve0xwnn9SN3fi47283yHnYy4MAJTgM5hisfzL75dxg66yC2Ok0Nbskl1t7ZCrNfpTUEDCSl2JhVhrLy5ns_wOGL1lp9wK2Sr22-ZTOPqrqMpfX-KyvVw_J_Ntx4aokq_TzTpM488_OBv_t-cvyLNqZ9LLYWK8JDuwOCQHT9gHj8jtZ_iex_Uh0hIR52m2X2m2B-nVyABOl4l-QGrdqoqVW0TxtG8r-rCg8-KEyAXU4IYVHJO766svs5um6is0sZNm3cS8QJL2HRMpdSgjmTwY71sdtWp7LwE0hF7YFlLXm2Ax51aqlGs0-KBTd0L2FssFvCKUIbGgEl6z5IXmyWarKPXAhc8WheJsQs63A-8eBxoNV44frXUDTg5xchWnCXmP4IyXIgd2qciD6uqSciYaCTzkh8YkOA_WM4YHIm-47ZXOjRwhEGMjFYMJOd1C7ep6XTkukChN5G3j9b_vekP2sYOD8-WU7K1_bOBtNkfW4V2Zh78AlbTdRw
  priority: 102
  providerName: IEEE
Title Semantic Relata for the Evaluation of Distributional Models in Mandarin Chinese
URI https://ieeexplore.ieee.org/document/8854798
https://www.proquest.com/docview/2455634146
https://doaj.org/article/8c85e2b4a7cf422b9a113056a829d671
Volume 7