Exploiting MeSH indexing in MEDLINE to generate a data set for word sense disambiguation
Published in | BMC bioinformatics Vol. 12; no. 1; p. 223 |
---|---|
Main Authors | Jimeno-Yepes, Antonio J; McInnes, Bridget T; Aronson, Alan R |
Format | Journal Article |
Language | English |
Published | England: BioMed Central Ltd (BioMed Central, BMC), 02.06.2011 |
Abstract | Evaluation of Word Sense Disambiguation (WSD) methods in the biomedical domain is difficult because the available resources are either too small or too focused on specific types of entities (e.g. diseases or genes). We present a method that can be used to automatically develop a WSD test collection using the Unified Medical Language System (UMLS) Metathesaurus and the manual MeSH indexing of MEDLINE. We demonstrate the use of this method by developing such a data set, called MSH WSD.
In our method, the Metathesaurus is first screened to identify ambiguous terms whose possible senses consist of two or more MeSH headings. We then use each ambiguous term and its corresponding MeSH headings to extract MEDLINE citations in which the term and only one of the MeSH headings co-occur. The term found in the MEDLINE citation is automatically assigned the UMLS Concept Unique Identifier (CUI) linked to that MeSH heading, so each instance carries a CUI. We compare the characteristics of the MSH WSD data set with those of the previously existing NLM WSD data set.
The resulting MSH WSD data set consists of 106 ambiguous abbreviations, 88 ambiguous terms and 9 that are a combination of both, for a total of 203 ambiguous entities. For each ambiguous term/abbreviation, the data set contains a maximum of 100 instances per sense obtained from MEDLINE. We evaluated the reliability of the MSH WSD data set using existing knowledge-based methods and compared their performance with the results these algorithms previously obtained on the pre-existing NLM WSD data set. We show that the knowledge-based methods achieve different results but keep their relative performance, except for the Journal Descriptor Indexing (JDI) method, whose performance is below that of the other methods.
The MSH WSD data set allows the evaluation of WSD algorithms in the biomedical domain. Compared to previously existing data sets, MSH WSD contains a larger number of biomedical terms/abbreviations and covers the largest set of UMLS Semantic Types. Furthermore, the MSH WSD data set has been generated automatically reusing already existing annotations and, therefore, can be regenerated from subsequent UMLS versions. |
---|---|
ArticleNumber | 223 |
Audience | Academic |
Author | Jimeno-Yepes, Antonio J; Aronson, Alan R; McInnes, Bridget T |
AuthorAffiliation | 1. National Library of Medicine, 8600 Rockville Pike, Bethesda, MD 20894, USA; 2. Department of Pharmacology, University of Minnesota Twin Cities, Minneapolis, MN 55155, USA |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/21635749 (record in MEDLINE/PubMed) |
ContentType | Journal Article |
Copyright | Copyright © 2011 Jimeno-Yepes et al; licensee BioMed Central Ltd. |
DOI | 10.1186/1471-2105-12-223 |
Discipline | Biology |
EISSN | 1471-2105 |
EndPage | 223 |
Genre | Journal Article; Research Support, N.I.H., Intramural |
GeographicLocations | United States |
GrantInformation_xml | – fundername: Intramural NIH HHS |
ISSN | 1471-2105 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 1 |
Language | English |
License | http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
OpenAccessLink | http://dx.doi.org/10.1186/1471-2105-12-223 |
PMID | 21635749 |
PublicationCentury | 2000 |
PublicationDate | 2011-06-02 |
PublicationDecade | 2010 |
PublicationPlace | England |
PublicationTitle | BMC bioinformatics |
PublicationTitleAlternate | BMC Bioinformatics |
PublicationYear | 2011 |
Publisher | BioMed Central Ltd; BioMed Central; BMC |
StartPage | 223 |
SubjectTerms | Abstracting and Indexing as Topic; Algorithms; Humans; Intermedin; Knowledge Bases; Medical Subject Headings; MEDLINE; Natural Language Processing; Online health care information services; Physiological aspects; Semantics; Unified Medical Language System; United States |
Title | Exploiting MeSH indexing in MEDLINE to generate a data set for word sense disambiguation |
URI | https://www.ncbi.nlm.nih.gov/pubmed/21635749 https://www.proquest.com/docview/874017881 http://dx.doi.org/10.1186/1471-2105-12-223 https://pubmed.ncbi.nlm.nih.gov/PMC3123611 https://doaj.org/article/0ed55616a0154d0292d86154d78b352a |
Volume | 12 |
linkProvider | BioMedCentral |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Exploiting+MeSH+indexing+in+MEDLINE+to+generate+a+data+set+for+word+sense+disambiguation&rft.jtitle=BMC+bioinformatics&rft.au=Jimeno-Yepes%2C+Antonio+J&rft.au=McInnes%2C+Bridget+T&rft.au=Aronson%2C+Alan+R&rft.date=2011-06-02&rft.pub=BioMed+Central+Ltd&rft.issn=1471-2105&rft.eissn=1471-2105&rft.volume=12&rft.spage=223&rft_id=info:doi/10.1186%2F1471-2105-12-223&rft.externalDBID=ISR&rft.externalDocID=A259765606 |