Weakly supervised learning of biomedical information extraction from curated data

Numerous publicly available biomedical databases derive their data by curation from the literature. The curated data can be useful as training examples for information extraction, but they usually lack the exact mentions and their locations in the text required for supervised machine learning. This p...

Bibliographic Details
Published in BMC Bioinformatics Vol. 17; no. S1; p. 1
Main Authors Jain, Suvir; R., Kashyap; Kuo, Tsung-Ting; Bhargava, Shitij; Lin, Gordon; Hsu, Chun-Nan
Format Journal Article
Language English
Published England BioMed Central Ltd 11.01.2016
BioMed Central
ISSN 1471-2105
DOI 10.1186/s12859-015-0844-1

Abstract
Background: Numerous publicly available biomedical databases derive their data by curation from the literature. The curated data can serve as training examples for information extraction, but they usually lack the exact mentions, and the locations of those mentions in the text, that supervised machine learning requires. This paper describes a general approach to information extraction that uses curated data as training examples. The idea is to formulate the problem as cost-sensitive learning from noisy labels, where the cost is estimated by a committee of weak classifiers that consider both the curated data and the text.
Results: We test the idea on two information extraction tasks for Genome-Wide Association Studies (GWAS). The first task is to extract the target phenotypes (diseases or traits) of a study, and the second is to extract the ethnic backgrounds of the study subjects at different stages (initial or replication). Experimental results show that our approach achieves a Precision-at-2 (P@2) of 87% for disease/trait extraction and an F1-score of 0.83 for stage-ethnicity extraction, both outperforming their cost-insensitive baseline counterparts.
Conclusions: The results show that curated biomedical databases can potentially be reused as training examples to train information extractors without expert annotation or refinement, opening an unprecedented opportunity to use "big data" in biomedical text mining.
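Illustration: The abstract describes the approach only at a high level, so the following is a minimal sketch of the general idea and not the authors' implementation. It assumes that the committee's agreement with a noisy, curation-derived label can act as a per-example weight (cost), and it uses the standard definition of precision at k, which may differ from the P@2 definition used in the paper. The function names `committee_weights` and `precision_at_k` are hypothetical.

```python
from typing import Callable, Hashable, List, Sequence, Set


def committee_weights(
    examples: Sequence[str],
    noisy_labels: Sequence[int],
    committee: Sequence[Callable[[str], int]],
) -> List[float]:
    """Weight each noisy label by the fraction of committee members agreeing with it.

    Assumption for illustration: a label that few weak classifiers support is
    more likely to be noise, so it receives a smaller weight (cost).
    """
    if not committee:
        raise ValueError("committee must contain at least one weak classifier")
    weights = []
    for example, label in zip(examples, noisy_labels):
        agreeing = sum(1 for member in committee if member(example) == label)
        weights.append(agreeing / len(committee))
    return weights


def precision_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Standard precision at k: the share of the top-k ranked items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked[:k]
    return sum(1 for item in top_k if item in relevant) / k
```

The resulting weights could then be passed as per-example costs to any learner that accepts example weights, so that labels the committee doubts contribute less to training.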
ArticleNumber S1
Audience Academic
BackLink https://www.ncbi.nlm.nih.gov/pubmed/26817711 (View this record in MEDLINE/PubMed)
ContentType Journal Article
Copyright COPYRIGHT 2016 BioMed Central Ltd.
Jain et al. 2015
Discipline Biology
EISSN 1471-2105
ExternalDocumentID PMC4847485
A468818990
A441824890
26817711
10_1186_s12859_015_0844_1
Genre Journal Article
Research Support, N.I.H., Extramural
GeographicLocations New York
GrantInformation NHGRI NIH HHS, grant U01 HG006894
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue S1
Language English
License Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
OpenAccessLink http://journals.scholarsportal.info/openUrl.xqy?doi=10.1186/s12859-015-0844-1
PMID 26817711
PublicationDate 2016-01-11
PublicationPlace England; London
PublicationTitle BMC Bioinformatics
PublicationYear 2016
Publisher BioMed Central Ltd
SubjectTerms Abstracting and Indexing as Topic - methods
Computational linguistics
Data Curation
Data mining
Data Mining - methods
Databases, Factual
Disease - genetics
Genetic Predisposition to Disease
Genome-Wide Association Study
Genomics
Humans
Language processing
Machine learning
Natural language interfaces
Proceedings
Risk Assessment