Exploiting MeSH indexing in MEDLINE to generate a data set for word sense disambiguation

Bibliographic Details
Published in	BMC bioinformatics Vol. 12; no. 1; p. 223
Main Authors	Jimeno-Yepes, Antonio J; McInnes, Bridget T; Aronson, Alan R
Format	Journal Article
Language	English
Published	England, BioMed Central Ltd, 02.06.2011
Subjects	Abstracting and Indexing as Topic; Algorithms; Humans; Intermedin; Knowledge Bases; Medical Subject Headings; MEDLINE; Natural Language Processing; Online health care information services; Physiological aspects; Semantics; Unified Medical Language System; United States
Abstract	BACKGROUND: Evaluation of Word Sense Disambiguation (WSD) methods in the biomedical domain is difficult because the available resources are either too small or too focused on specific types of entities (e.g. diseases or genes). We present a method that can be used to automatically develop a WSD test collection using the Unified Medical Language System (UMLS) Metathesaurus and the manual MeSH indexing of MEDLINE. We demonstrate the use of this method by developing such a data set, called MSH WSD.
METHODS: In our method, the Metathesaurus is first screened to identify ambiguous terms whose possible senses correspond to two or more MeSH headings. We then use each ambiguous term and its corresponding MeSH headings to extract MEDLINE citations in which the term and only one of the MeSH headings co-occur. The term found in the MEDLINE citation is automatically assigned the UMLS Concept Unique Identifier (CUI) linked to that MeSH heading, so every instance carries a CUI. We compare the characteristics of the MSH WSD data set to those of the previously existing NLM WSD data set.
RESULTS: The resulting MSH WSD data set consists of 106 ambiguous abbreviations, 88 ambiguous terms and 9 which are a combination of both, for a total of 203 ambiguous entities. For each ambiguous term/abbreviation, the data set contains a maximum of 100 instances per sense obtained from MEDLINE. We evaluated the reliability of the MSH WSD data set using existing knowledge-based methods and compared their performance with the results these algorithms previously obtained on the pre-existing NLM WSD data set. We show that the knowledge-based methods achieve different results but keep their relative performance, except for the Journal Descriptor Indexing (JDI) method, whose performance is below that of the other methods.
CONCLUSIONS: The MSH WSD data set allows the evaluation of WSD algorithms in the biomedical domain. Compared to previously existing data sets, MSH WSD contains a larger number of biomedical terms/abbreviations and covers the largest set of UMLS Semantic Types. Furthermore, the MSH WSD data set has been generated automatically by reusing already existing annotations and, therefore, can be regenerated from subsequent UMLS versions.
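The selection rule described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the record layout, the sense identifiers (`CUI-TEMP`, `CUI-ILLNESS`) and the toy citations are all hypothetical, and the term match is a naive case-insensitive substring test. Only the three rules come from the abstract: a term is ambiguous when it has two or more MeSH-linked senses, a citation is kept when the term co-occurs with exactly one of those MeSH headings, and at most 100 instances are kept per sense.

```python
from collections import defaultdict

MAX_PER_SENSE = 100  # cap per sense stated in the abstract


def ambiguous_terms(term_to_senses):
    """Keep only terms with two or more senses; each sense is a (CUI, MeSH heading) pair."""
    return {term: senses for term, senses in term_to_senses.items() if len(senses) >= 2}


def label_citations(term, senses, citations, max_per_sense=MAX_PER_SENSE):
    """Assign a CUI to citations where the term co-occurs with exactly one candidate heading."""
    heading_to_cui = {heading: cui for cui, heading in senses}
    instances = defaultdict(list)
    for citation in citations:
        # Assumption: naive substring match stands in for real term matching.
        if term.lower() not in citation["text"].lower():
            continue
        matched = heading_to_cui.keys() & set(citation["mesh"])
        if len(matched) != 1:
            continue  # zero or several candidate headings: the sense is not determined
        cui = heading_to_cui[matched.pop()]
        if len(instances[cui]) < max_per_sense:
            instances[cui].append(citation["pmid"])
    return dict(instances)


# Hypothetical toy input; identifiers and texts are placeholders, not real records.
TERM_TO_SENSES = {
    "cold": [("CUI-TEMP", "Cold Temperature"), ("CUI-ILLNESS", "Common Cold")],
    "aspirin": [("CUI-DRUG", "Aspirin")],
}
CITATIONS = [
    {"pmid": 1, "text": "Exposure to cold reduces skin blood flow.", "mesh": ["Cold Temperature", "Humans"]},
    {"pmid": 2, "text": "Zinc for the common cold.", "mesh": ["Common Cold"]},
    {"pmid": 3, "text": "Cold exposure and catching a cold.", "mesh": ["Cold Temperature", "Common Cold"]},
    {"pmid": 4, "text": "Gene expression in yeast.", "mesh": ["Common Cold"]},
]

if __name__ == "__main__":
    for term, senses in ambiguous_terms(TERM_TO_SENSES).items():
        print(term, label_citations(term, senses, CITATIONS))
```

In this toy input, citation 3 is skipped because both candidate headings are present and citation 4 because the term does not appear in the text, which mirrors the "only one of the MeSH headings" condition.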
AuthorAffiliation	1 National Library of Medicine, 8600 Rockville Pike, Bethesda, MD 20894, USA; 2 Department of Pharmacology, University of Minnesota Twin Cities, Minneapolis, MN 55155, USA
Copyright	COPYRIGHT 2011 BioMed Central Ltd. Copyright ©2011 Jimeno-Yepes et al; licensee BioMed Central Ltd.
DOI	10.1186/1471-2105-12-223
Discipline	Biology
EISSN	1471-2105
Genre	Journal Article Research Support, N.I.H., Intramural
GeographicLocations	United States
GrantInformation	Intramural NIH HHS
ISSN	1471-2105
IsDoiOpenAccess	true
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
License	http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
OpenAccessLink	http://dx.doi.org/10.1186/1471-2105-12-223
PMID	21635749
PQID	874017881
PQPubID	23479
ParticipantIDs	doaj_primary_oai_doaj_org_article_0ed55616a0154d0292d86154d78b352a pubmedcentral_primary_oai_pubmedcentral_nih_gov_3123611 biomedcentral_primary_oai_biomedcentral_com_1471_2105_12_223 proquest_miscellaneous_874017881 gale_infotracmisc_A259765606 gale_infotracacademiconefile_A259765606 gale_incontextgauss_ISR_A259765606 pubmed_primary_21635749 crossref_citationtrail_10_1186_1471_2105_12_223 crossref_primary_10_1186_1471_2105_12_223
ProviderPackageCode	CITATION AAYXX
PublicationCentury	2000
PublicationDate	2011-06-02
PublicationDateYYYYMMDD	2011-06-02
PublicationDate_xml	– month: 06 year: 2011 text: 2011-06-02 day: 02
PublicationDecade	2010
PublicationPlace	England
PublicationPlace_xml	– name: England
PublicationTitle	BMC bioinformatics
PublicationTitleAlternate	BMC Bioinformatics
PublicationYear	2011
Publisher	BioMed Central Ltd BioMed Central BMC
Publisher_xml	– name: BioMed Central Ltd – name: BioMed Central – name: BMC
References	S Gaudan (4593_CR5) 2005; 21 G Leroy (4593_CR34) 2005; 74 A Jimeno (4593_CR3) 2008; 9 C Manning (4593_CR7) 2000 4593_CR17 4593_CR16 4593_CR19 4593_CR18 H Liu (4593_CR11) 2002; 9 4593_CR31 A Schwartz (4593_CR14) 2003; 8 S Humphrey (4593_CR26) 2006; 57 4593_CR32 B McInnes (4593_CR27) 2008 4593_CR35 L Hirschman (4593_CR1) 2005; 6 J Fan (4593_CR15) 2009 C Leacock (4593_CR30) 1998; 24 T Pedersen (4593_CR8) 2010 R Leaman (4593_CR4) 2009 H Liu (4593_CR10) 2001; 34 H Liu (4593_CR12) 2004; 11 B McInnes (4593_CR28) 2009 A Yeh (4593_CR33) 2000 WA Gale (4593_CR9) 1992 4593_CR29 M Stevenson (4593_CR13) 2009 4593_CR20 A Jimeno-Yepes (4593_CR25) 2010; 11 4593_CR22 M Weeber (4593_CR6) 2001 4593_CR21 P Pezik (4593_CR2) 2008 4593_CR24 4593_CR23 11977807 - J Biomed Inform. 2001 Aug;34(4):249-61 15897005 - Int J Med Inform. 2005 Aug;74(7-8):573-85 12386113 - J Am Med Inform Assoc. 2002 Nov-Dec;9(6):621-36 14681409 - Nucleic Acids Res. 2004 Jan 1;32(Database issue):D267-70 15064284 - J Am Med Inform Assoc. 2004 Jul-Aug;11(4):320-31 11825285 - Proc AMIA Symp. 2001;:746-50 19890434 - J Am Soc Inf Sci Technol. 2006 Jan 1;57(1):96-113 21092226 - BMC Bioinformatics. 2010;11:569 12603049 - Pac Symp Biocomput. 2003;:451-62 18426548 - BMC Bioinformatics. 2008;9 Suppl 3:S3 15960821 - BMC Bioinformatics. 2005;6 Suppl 1:S1 20351846 - AMIA Annu Symp Proc. 2009;2009:183-7 16037121 - Bioinformatics. 2005 Sep 15;21(18):3658-64
References_xml	– ident: 4593_CR18 – ident: 4593_CR20 – start-page: 746 volume-title: Proceedings of the AMIA Symposium, American Medical Informatics Association year: 2001 ident: 4593_CR6 – ident: 4593_CR22 – ident: 4593_CR24 – volume: 6 start-page: S1 issue: Suppl 1 year: 2005 ident: 4593_CR1 publication-title: BMC bioinformatics doi: 10.1186/1471-2105-6-S1-S1 – volume-title: Building and evaluating resources for biomedical text mining, LREC Workshop year: 2008 ident: 4593_CR2 – volume: 9 start-page: 621 issue: 6 year: 2002 ident: 4593_CR11 publication-title: Journal of the American Medical Informatics Association doi: 10.1197/jamia.M1101 – volume: 11 start-page: 565 year: 2010 ident: 4593_CR25 publication-title: BMC bioinformatics doi: 10.1186/1471-2105-11-569 – ident: 4593_CR16 doi: 10.1093/nar/gkh061 – volume-title: PhD thesis year: 2009 ident: 4593_CR28 – ident: 4593_CR29 – start-page: 947 volume-title: Proceedings of the 18th conference on Computational linguistics-Volume 2, Association for Computational Linguistics year: 2000 ident: 4593_CR33 doi: 10.3115/992730.992783 – volume-title: Proceedings of the 2009 Symposium on Languages in Biology and Medicine year: 2009 ident: 4593_CR4 – ident: 4593_CR32 – volume: 9 start-page: S3 issue: Suppl 3 year: 2008 ident: 4593_CR3 publication-title: BMC bioinformatics doi: 10.1186/1471-2105-9-S3-S3 – volume-title: Proceedings of the 1st ACM International Health Informatics Symposium, Arlington, VA year: 2010 ident: 4593_CR8 – ident: 4593_CR19 – volume: 21 start-page: 3658 issue: 18 year: 2005 ident: 4593_CR5 publication-title: Bioinformatics doi: 10.1093/bioinformatics/bti586 – volume: 24 start-page: 147 year: 1998 ident: 4593_CR30 publication-title: Computational Linguistics – ident: 4593_CR17 – ident: 4593_CR21 – ident: 4593_CR23 – volume: 8 start-page: 451 year: 2003 ident: 4593_CR14 publication-title: Pacific Symposium on Biocomputing – volume: 34 start-page: 249 issue: 4 year: 2001 ident: 4593_CR10 publication-title: 
Journal of Biomedical Informatics doi: 10.1006/jbin.2001.1023 – start-page: 49 volume-title: Proceedings of the ACL-08: HLT Student Research Workshop, Columbus, Ohio: Association for Computational Linguistics year: 2008 ident: 4593_CR27 – start-page: 71 volume-title: Proceedings of the Workshop on BioNLP, Association for Computational Linguistics year: 2009 ident: 4593_CR13 – volume: 57 start-page: 96 year: 2006 ident: 4593_CR26 publication-title: Journal of the American Society for Information Science and Technology (Print) doi: 10.1002/asi.20257 – ident: 4593_CR31 – start-page: 233 volume-title: HLT '91: Proceedings of the workshop on Speech and Natural Language, Morristown, NJ, USA: Association for Computational Linguistics year: 1992 ident: 4593_CR9 doi: 10.3115/1075527.1075579 – ident: 4593_CR35 – start-page: 183 volume-title: AMIA Annual Symposium Proceedings, Volume 2009, American Medical Informatics Association year: 2009 ident: 4593_CR15 – volume: 74 start-page: 573 issue: 7-8 year: 2005 ident: 4593_CR34 publication-title: International Journal of Medical Informatics doi: 10.1016/j.ijmedinf.2005.03.013 – volume-title: Foundations of statistical natural language processing year: 2000 ident: 4593_CR7 – volume: 11 start-page: 320 issue: 4 year: 2004 ident: 4593_CR12 publication-title: Journal of the American Medical Informatics Association doi: 10.1197/jamia.M1533 – reference: 21092226 - BMC Bioinformatics. 2010;11:569 – reference: 19890434 - J Am Soc Inf Sci Technol. 2006 Jan 1;57(1):96-113 – reference: 11825285 - Proc AMIA Symp. 2001;:746-50 – reference: 18426548 - BMC Bioinformatics. 2008;9 Suppl 3:S3 – reference: 15897005 - Int J Med Inform. 2005 Aug;74(7-8):573-85 – reference: 12386113 - J Am Med Inform Assoc. 2002 Nov-Dec;9(6):621-36 – reference: 16037121 - Bioinformatics. 2005 Sep 15;21(18):3658-64 – reference: 15064284 - J Am Med Inform Assoc. 2004 Jul-Aug;11(4):320-31 – reference: 15960821 - BMC Bioinformatics. 
2005;6 Suppl 1:S1 – reference: 11977807 - J Biomed Inform. 2001 Aug;34(4):249-61 – reference: 12603049 - Pac Symp Biocomput. 2003;:451-62 – reference: 20351846 - AMIA Annu Symp Proc. 2009;2009:183-7 – reference: 14681409 - Nucleic Acids Res. 2004 Jan 1;32(Database issue):D267-70
SSID	ssj0017805
Score	2.3442416
Snippet	Evaluation of Word Sense Disambiguation (WSD) methods in the biomedical domain is difficult because the available resources are either too small or too focused... In our method, the Metathesaurus is first screened to identify ambiguous terms whose possible senses consist of two or more MeSH headings. We then use each... Background Evaluation of Word Sense Disambiguation (WSD) methods in the biomedical domain is difficult because the available resources are either too small or... BACKGROUND: Evaluation of Word Sense Disambiguation (WSD) methods in the biomedical domain is difficult because the available resources are either too small or... Abstract Background Evaluation of Word Sense Disambiguation (WSD) methods in the biomedical domain is difficult because the available resources are either too...
SourceID	doaj pubmedcentral biomedcentral proquest gale pubmed crossref
SourceType	Open Website Open Access Repository Aggregation Database Index Database Enrichment Source
StartPage	223
SubjectTerms	Abstracting and Indexing as Topic Algorithms Humans Intermedin Knowledge Bases Medical Subject Headings MEDLINE Natural Language Processing Online health care information services Physiological aspects Semantics Unified Medical Language System United States
SummonAdditionalLinks	– databaseName: DOAJ Directory of Open Access Journals dbid: DOA link: http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwrV3di9QwEA9yIPginp_VU4II4kPZfiYN-HLqHnvi3oPnwb6FNJ2shbMrtsvhf-9M2l02qPjiW5ukNMlMOjPpL79h7BWgmTFS5nHtd6sgK2MjSxe70romVcpZQ_uQywuxuCo-rsrVQaovwoSN9MDjxM0SaCiDozBk7JskU1lTCbqUVY3Og3eN0Obtgqnp_wEx9ftzRTKNMagpdz8oKzHblxEoIaM0RcFJ9-vAQHke_9-_1gfmKoRSHtims3vs7uRU8tNxMMfsFnT32e0xzeTPB2zlgXYtAZz5Ei4X3FMk0l3b8eX8w6fzizkfNnztKagH4IYTcJT3MHD0afkNBqh40_XAm7Y33-p2PRKEP2RXZ_Mv7xfxlFEhrkWWDXEDIFySGqsgt-h4qboBDECsVDVUGcgmxXJIhE0MehqFKZUrlbAOjCtylzT5I3bUbTp4wrhE36xWaWbLQhQCrIFSJi5XZBEt5FXE3gbTqr-P7Bma-KzDGlxamqSiSSo6zTRKJWKznRS0ndjKKWnGtfZRSyX-8MSb_RO7d_297TsSbNAnX4CapyfN0__SvIi9JLXQxKLREUxnbbZ9r88vP-tTDCol0RqJiL2eGrkN9t-a6dQDTiIRbwUtT4KWuMxtUM132qepirBxHWy2vfZJFSkrQMQej8q4H1eWEt1goSImAzUNBh7WdO1XTzKee1qe9On_mKln7M64FU949xN2NPzYwnP05Yb6hV-2vwDwt0DL priority: 102 providerName: Directory of Open Access Journals – databaseName: Scholars Portal Journals: Open Access dbid: M48 link: http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwjV3db9MwELdgCIkXxDeBgSyEhHgIy6cdSyA0oFOH6B4YlfpmOc6lVCopNKlg_z13TrrNUCTekviS2L673J1z_h1jzwHNjJEyDUu3WgVJHhqZ12Gd27qKlaqtoXXIyYkYT7OPs3x2sT16mMB2Z2hH9aSm6-WrXz_O3qLCv3EKX4iDGD-wIYYuOaUZoLm7yq6hXZKkppPs4p8Cofe7vUYD9fan5Y4n_LH7fekZLYft__cX_JIJ89MrL9mro1vs5uBo8sNeMm6zK9DcYdf70pNnd9nMJd8tKOmZT-B0zB1sIp0tGj4Zffh0fDLi3YrPHSx1B9xwSiblLXQc_Vz-E4NWPGla4NWiNd_KxbwHDb_HpkejL-_H4VBlISxFknRhBSDqKDZWQWrRGVNlBRiUWKlKKBKQVYzXIRI2Muh9ZCZXda6ErcHUWVpHVXqf7TWrBh4yLtFfK1Wc2DwTmQBrIJdRnSqykhbSImCvvWnV33tEDU0Y134Lqpsmrmjiio4TjVwJ2MGWC9oOCOZUSGOpXSRTiB13vDy_Y_uuf9O-I8Z6fXIXVuu5HtRYR1BRPVFhyPWsokQlVSHoUBYlurImYM9ILDQhazSUujM3m7bVx6ef9SEGmpKgjkTAXgxE9Qr7b82wEwInkcC4PMp9jxJV33rNfCt9mpooX66B1abVrtAiVQoI2INeGM_HlcQEQZipgElPTL2B-y3N4qsDHk8dVE_86H8G-Zjd6JffKcd9n-116w08Qf-tK586tfwNdGk-TQ priority: 102 providerName: Scholars Portal
Title	Exploiting MeSH indexing in MEDLINE to generate a data set for word sense disambiguation
URI	https://www.ncbi.nlm.nih.gov/pubmed/21635749 https://www.proquest.com/docview/874017881 http://dx.doi.org/10.1186/1471-2105-12-223 https://pubmed.ncbi.nlm.nih.gov/PMC3123611 https://doaj.org/article/0ed55616a0154d0292d86154d78b352a
Volume	12
linkProvider	BioMedCentral