MeSH: a window into full text for document summarization

Bibliographic Details
Published in Bioinformatics Vol. 27; no. 13; pp. i120-i128
Main Authors Bhattacharya, Sanmitra; Ha-Thuc, Viet; Srinivasan, Padmini
Format Journal Article
Language English
Published England Oxford University Press 01.07.2011
Abstract
Motivation: Previous research in the biomedical text-mining domain has historically been limited to titles, abstracts and metadata available in MEDLINE records. Recent research initiatives such as TREC Genomics and BioCreAtIvE strongly point to the merits of moving beyond abstracts and into the realm of full texts. Full texts are, however, more expensive to process not only in terms of resources needed but also in terms of accuracy. Since full texts contain embellishments that elaborate, contextualize, contrast, supplement, etc., there is greater risk for false positives. Motivated by this, we explore an approach that offers a compromise between the extremes of abstracts and full texts. Specifically, we create reduced versions of full text documents that contain only important portions. In the long-term, our goal is to explore the use of such summaries for functions such as document retrieval and information extraction. Here, we focus on designing summarization strategies. In particular, we explore the use of MeSH terms, manually assigned to documents by trained annotators, as clues to select important text segments from the full text documents.
Results: Our experiments confirm the ability of our approach to pick the important text portions. Using the ROUGE measures for evaluation, we were able to achieve maximum ROUGE-1, ROUGE-2 and ROUGE-SU4 F-scores of 0.4150, 0.1435 and 0.1782, respectively, for our MeSH term-based method versus the maximum baseline scores of 0.3815, 0.1353 and 0.1428, respectively. Using a MeSH profile-based strategy, we were able to achieve maximum ROUGE F-scores of 0.4320, 0.1497 and 0.1887, respectively. Human evaluation of the baselines and our proposed strategies further corroborates the ability of our method to select important sentences from the full texts.
Contact: sanmitra-bhattacharya@uiowa.edu; padmini-srinivasan@uiowa.edu
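The abstract describes two ideas that a short sketch can make concrete: using MeSH terms as clues for picking sentences out of a full text, and scoring the result with a ROUGE-1 F-score. The Python below is an illustration only, not the authors' method. The function names (`select_sentences`, `rouge1_f`), the plain token-overlap scoring, and the simple regex tokenizer are all assumptions introduced here; the official ROUGE package also offers stemming and stop-word options that this sketch ignores.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def select_sentences(sentences, mesh_terms, k=2):
    """Return the k sentences sharing the most distinct tokens with the MeSH terms.

    Hypothetical scoring: plain token overlap, not the scoring used in the paper.
    """
    mesh_vocab = set()
    for term in mesh_terms:
        mesh_vocab.update(tokenize(term))
    scored = [(len(mesh_vocab & set(tokenize(s))), i) for i, s in enumerate(sentences)]
    # Highest overlap first; ties are broken by original position.
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:k]
    # Emit the chosen sentences in document order.
    return [sentences[i] for _, i in sorted(top, key=lambda pair: pair[1])]


def rouge1_f(candidate, reference):
    """Unigram-overlap F-score between a candidate text and a reference text."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    # Counter intersection keeps the minimum count per token (clipped overlap).
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

In this sketch, a reduced version of a document would be the output of `select_sentences` joined back into text, and `rouge1_f` would compare it against a reference summary such as the article's own abstract; that choice of reference is likewise an assumption.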
AuthorAffiliation (1) Department of Computer Science and (2) Department of Management Sciences, The University of Iowa, Iowa City, IA 52242, USA
BackLink https://www.ncbi.nlm.nih.gov/pubmed/21685060 (View this record in MEDLINE/PubMed)
ContentType Journal Article
Copyright The Author(s) 2011. Published by Oxford University Press. 2011
DOI 10.1093/bioinformatics/btr223
Discipline Biology
EISSN 1460-2059
1367-4811
EndPage i128
Genre Research Support, U.S. Gov't, Non-P.H.S
Journal Article
GeographicLocations United States
ISSN 1367-4803
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 13
Language English
License http://creativecommons.org/licenses/by-nc/2.5
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
OpenAccessLink https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3117369/
PMID 21685060
PublicationDate 2011-07-01
PublicationPlace England
PublicationTitle Bioinformatics
PublicationYear 2011
Publisher Oxford University Press
StartPage i120
SubjectTerms Information Storage and Retrieval
Medical Subject Headings
MEDLINE
Original Papers
United States
Title MeSH: a window into full text for document summarization
URI https://www.ncbi.nlm.nih.gov/pubmed/21685060
https://search.proquest.com/docview/873121778
https://pubmed.ncbi.nlm.nih.gov/PMC3117369
Volume 27