MeSH: a window into full text for document summarization
Published in | Bioinformatics Vol. 27; no. 13; pp. i120 - i128 |
---|---|
Main Authors | Bhattacharya, Sanmitra; Ha-Thuc, Viet; Srinivasan, Padmini |
Format | Journal Article |
Language | English |
Published | England: Oxford University Press, 01.07.2011 |
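Subjects | Information Storage and Retrieval; Medical Subject Headings; MEDLINE; Original Papers; United States |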
Abstract | Motivation: Previous research in the biomedical text-mining domain has historically been limited to titles, abstracts and metadata available in MEDLINE records. Recent research initiatives such as TREC Genomics and BioCreAtIvE strongly point to the merits of moving beyond abstracts and into the realm of full texts. Full texts are, however, more expensive to process not only in terms of resources needed but also in terms of accuracy. Since full texts contain embellishments that elaborate, contextualize, contrast, supplement, etc., there is greater risk for false positives. Motivated by this, we explore an approach that offers a compromise between the extremes of abstracts and full texts. Specifically, we create reduced versions of full text documents that contain only important portions. In the long-term, our goal is to explore the use of such summaries for functions such as document retrieval and information extraction. Here, we focus on designing summarization strategies. In particular, we explore the use of MeSH terms, manually assigned to documents by trained annotators, as clues to select important text segments from the full text documents.
Results: Our experiments confirm the ability of our approach to pick the important text portions. Using the ROUGE measures for evaluation, we were able to achieve maximum ROUGE-1, ROUGE-2 and ROUGE-SU4 F-scores of 0.4150, 0.1435 and 0.1782, respectively, for our MeSH term-based method versus the maximum baseline scores of 0.3815, 0.1353 and 0.1428, respectively. Using a MeSH profile-based strategy, we were able to achieve maximum ROUGE F-scores of 0.4320, 0.1497 and 0.1887, respectively. Human evaluation of the baselines and our proposed strategies further corroborates the ability of our method to select important sentences from the full texts.
Contact:
sanmitra-bhattacharya@uiowa.edu; padmini-srinivasan@uiowa.edu |
---|---|
Author | Bhattacharya, Sanmitra; Ha-Thuc, Viet; Srinivasan, Padmini |
AuthorAffiliation | 1 Department of Computer Science and 2 Department of Management Sciences, The University of Iowa, Iowa City, IA 52242, USA |
Author_xml | 1. Bhattacharya, Sanmitra; 2. Ha-Thuc, Viet; 3. Srinivasan, Padmini (organization for all three: 1 Department of Computer Science and 2 Department of Management Sciences, The University of Iowa, Iowa City, IA 52242, USA) |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/21685060 (View this record in MEDLINE/PubMed) |
ContentType | Journal Article |
Copyright | The Author(s) 2011. Published by Oxford University Press. 2011 |
DOI | 10.1093/bioinformatics/btr223 |
Discipline | Biology |
EISSN | 1460-2059 1367-4811 |
EndPage | i128 |
ExternalDocumentID | 10_1093_bioinformatics_btr223 21685060 10.1093/bioinformatics/btr223 |
Genre | Research Support, U.S. Gov't, Non-P.H.S.; Journal Article |
GeographicLocations | United States |
ISSN | 1367-4803 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 13 |
Language | English |
License | http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
OpenAccessLink | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3117369/ |
PMID | 21685060 |
PublicationCentury | 2000 |
PublicationDate | 2011-07-01 |
PublicationDateYYYYMMDD | 2011-07-01 |
PublicationDecade | 2010 |
PublicationPlace | England |
PublicationTitle | Bioinformatics |
PublicationTitleAlternate | Bioinformatics |
PublicationYear | 2011 |
Publisher | Oxford University Press |
StartPage | i120 |
SubjectTerms | Information Storage and Retrieval; Medical Subject Headings; MEDLINE; Original Papers; United States |
Title | MeSH: a window into full text for document summarization |
URI | https://www.ncbi.nlm.nih.gov/pubmed/21685060 https://search.proquest.com/docview/873121778 https://pubmed.ncbi.nlm.nih.gov/PMC3117369 |
Volume | 27 |
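
The abstract in this record describes selecting important full-text sentences using MeSH terms as clues and scoring the result with ROUGE. As a rough illustration only, the following Python sketch ranks sentences by token overlap with a document's MeSH terms and computes a balanced ROUGE-1 F-score. The function names (`select_sentences`, `rouge1_f`), the overlap-count scoring, the small stopword list, and the tokenizer are all assumptions made for this sketch; the record does not specify the authors' actual scoring function or the ROUGE configuration they used.

```python
import re
from collections import Counter

# Assumption: a tiny stopword list so that words like "and" inside a
# MeSH term ("Information Storage and Retrieval") do not count as matches.
STOPWORDS = {"and", "of", "the", "in", "for"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def select_sentences(sentences, mesh_terms, k=2):
    """Return the k sentences that share the most tokens with the MeSH terms.

    Hypothetical scoring: the number of sentence tokens that also occur in
    some MeSH term. Ties are broken by position, and the selected sentences
    are returned in their original document order.
    """
    vocab = {t for term in mesh_terms for t in tokenize(term)} - STOPWORDS
    scored = [
        (sum(tok in vocab for tok in tokenize(sentence)), index)
        for index, sentence in enumerate(sentences)
    ]
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:k]
    return [sentences[index] for _, index in sorted(top, key=lambda pair: pair[1])]


def rouge1_f(candidate, reference):
    """Balanced ROUGE-1 F-score from clipped unigram precision and recall."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

In this sketch a summary would be built by joining the output of `select_sentences` and comparing it against a reference text (for example the article's abstract, which is an assumption here) with `rouge1_f`. Real MeSH matching would likely need stemming or synonym handling, which the sketch omits.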