A review of biomedical datasets relating to drug discovery: a knowledge graph perspective

Bibliographic Details
Published in Briefings in Bioinformatics Vol. 23; no. 6
Main Authors Bonner, Stephen; Barrett, Ian P; Ye, Cheng; Swiers, Rowan; Engkvist, Ola; Bender, Andreas; Hoyt, Charles Tapley; Hamilton, William L
Format Journal Article
Language English
Published England 19.11.2022
Abstract Drug discovery and development is a complex and costly process. Machine learning approaches are being investigated to help improve the effectiveness and speed of multiple stages of the drug discovery pipeline. Of these, those that use knowledge graphs (KGs) have promise in many tasks, including drug repurposing, drug toxicity prediction and target gene–disease prioritization. In a drug discovery KG, crucial elements including genes, diseases and drugs are represented as entities, while relationships between them indicate an interaction. However, to construct high-quality KGs, suitable data are required. In this review, we detail publicly available sources suitable for use in constructing drug discovery-focused KGs. We aim to help guide machine learning and KG practitioners who are interested in applying new techniques to the drug discovery field, but who may be unfamiliar with the relevant data sources. The datasets are selected via strict criteria, categorized according to the primary type of information they contain, and assessed for what information could be extracted to build a KG. We then present a comparative analysis of existing public drug discovery KGs and an evaluation of selected motivating case studies from the literature. Additionally, we raise numerous and unique challenges and issues associated with the domain and its datasets, while also highlighting key future research directions. We hope this review will motivate the use of KGs in solving key and emerging questions in the drug discovery domain.
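
The abstract describes a KG as entities (genes, diseases, drugs) connected by relationships. As a minimal sketch of that structure, and not the authors' method, the Python below stores a KG as (head, relation, tail) triples and follows a two-hop path to list drugs linked to a disease. All identifiers (`drug:D1`, `gene:G1`, `disease:X`), the relation names `targets` and `associated_with`, and the helper functions are hypothetical placeholders, not records or vocabulary from any dataset covered by the review.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One KG fact: a head entity, a relation type, and a tail entity."""
    head: str
    relation: str
    tail: str


# Hypothetical toy facts; placeholders only, not taken from any real dataset.
TRIPLES = [
    Triple("drug:D1", "targets", "gene:G1"),
    Triple("gene:G1", "associated_with", "disease:X"),
    Triple("drug:D2", "targets", "gene:G2"),
    Triple("gene:G2", "associated_with", "disease:Y"),
]


def build_index(triples):
    """Map each (head, relation) pair to the set of tail entities it points to."""
    index = defaultdict(set)
    for t in triples:
        index[(t.head, t.relation)].add(t.tail)
    return index


def candidate_drugs(triples, disease):
    """Return drugs reachable via drug -targets-> gene -associated_with-> disease."""
    index = build_index(triples)
    hits = set()
    for (head, relation), genes in index.items():
        if relation != "targets":
            continue
        for gene in genes:
            # .get avoids adding new keys to the index while iterating over it.
            if disease in index.get((gene, "associated_with"), set()):
                hits.add(head)
    return sorted(hits)
```

With the toy facts above, `candidate_drugs(TRIPLES, "disease:X")` follows the path through `gene:G1`. Real drug discovery KGs are far larger and are usually queried with learned embeddings or graph databases rather than a hand-written path search like this one.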
Copyright The Author(s) 2022. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
DOI 10.1093/bib/bbac404
Discipline Biology
EISSN 1477-4054
Genre Journal Article
Review
ISSN 1467-5463
1477-4054
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 6
Keywords drug–target discovery
knowledge graph embeddings
disease–gene prediction
Language English
License https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model
OpenAccessLink https://research.chalmers.se/publication/532491
PMID 36151740
  publication-title: Nucleic Acids Res
  doi: 10.1093/nar/gkv1075
– start-page: 57
  volume-title: Proceedings of the 3rd workshop on continuous vector space models and their compositionality
  year: 2015
  ident: 2022112111112326300_ref110
  doi: 10.18653/v1/W15-4007
– volume: 4
  start-page: 1
  issue: 1
  year: 2017
  ident: 2022112111112326300_ref75
  article-title: A standard database for drug repositioning
  publication-title: Scientific data
  doi: 10.1038/sdata.2017.29
– year: 2020
  ident: 2022112111112326300_ref13
  article-title: DRKG - Drug Repurposing Knowledge Graph for Covid-19
– volume: 88
  start-page: 265
  issue: 3
  year: 2000
  ident: 2022112111112326300_ref43
  article-title: Medical subject headings (MeSH)
  publication-title: Bull Med Libr Assoc
– volume: 275
  start-page: 6
  year: 2020
  ident: 2022112111112326300_ref10
  article-title: Exploring the Social Drivers of Health During a Pandemic: Leveraging Knowledge Graphs and Population Trends in COVID-19
  publication-title: Stud Health Technol Inform
– volume: 6
  year: 2017
  ident: 2022112111112326300_ref18
  article-title: Systematic integration of biomedical knowledge prioritizes drugs for repurposing
  publication-title: Elife
  doi: 10.7554/eLife.26726
– volume: 45
  start-page: 1
  year: 2022
  ident: 2022112111112326300_ref82
  article-title: A knowledge graph to interpret clinical proteomics data
  publication-title: Nat Biotechnol
– volume: 38
  start-page: D355
  issue: suppl_1
  year: 2010
  ident: 2022112111112326300_ref98
  article-title: KEGG for representation and analysis of molecular networks involving diseases and drugs
  publication-title: Nucleic Acids Res
  doi: 10.1093/nar/gkp896
– volume: 15
  start-page: 57
  issue: 1
  year: 2000
  ident: 2022112111112326300_ref68
  article-title: Online Mendelian inheritance in man (OMIM)
  publication-title: Hum Mutat
  doi: 10.1002/(SICI)1098-1004(200001)15:1<57::AID-HUMU12>3.0.CO;2-G
– volume: 26
  start-page: 3111
  year: 2013
  ident: 2022112111112326300_ref78
  article-title: Distributed Representations of Words and Phrases and their Compositionality
  publication-title: Advances in Neural Information Processing Systems
– volume: 44
  start-page: D380
  issue: D1
  year: 2016
  ident: 2022112111112326300_ref104
  article-title: STITCH 5: augmenting protein–chemical interaction networks with tissue and affinity data
  publication-title: Nucleic Acids Res
  doi: 10.1093/nar/gkv1277
– volume: 3
  year: 2020
  ident: 2022112111112326300_ref33
  article-title: Knowledge-Based Biomedical Data Science. Annual Review of Biomedical Data
  publication-title: Science
– volume: 46
  start-page: D661
  issue: D1
  year: 2018
  ident: 2022112111112326300_ref63
  article-title: WikiPathways: a multifaceted pathway database bridging metabolomics to other omics research
  publication-title: Nucleic Acids Res
  doi: 10.1093/nar/gkx1064
– volume: 36
  start-page: D480
  issue: suppl_1
  year: 2007
  ident: 2022112111112326300_ref65
  article-title: KEGG for linking genomes to life and the environment
  publication-title: Nucleic Acids Res
  doi: 10.1093/nar/gkm882
– volume: 112
  start-page: 1087
  issue: 2
  year: 2020
  ident: 2022112111112326300_ref26
  article-title: Drug databases and their contributions to drug repurposing
  publication-title: Genomics
  doi: 10.1016/j.ygeno.2019.06.021
– volume: 44
  start-page: D1202
  issue: D1
  year: 2016
  ident: 2022112111112326300_ref71
  article-title: PubChem substance and compound databases
  publication-title: Nucleic Acids Res
  doi: 10.1093/nar/gkv951
– volume: 10
  start-page: 1
  issue: 1
  year: 2019
  ident: 2022112111112326300_ref87
  article-title: Integrating biomedical research and electronic health records to create knowledge-based biologically meaningful machine-readable embeddings
  publication-title: Nat Commun
  doi: 10.1038/s41467-019-11069-0
– volume: 10
  start-page: 1381
  year: 2020
  ident: 2022112111112326300_ref22
  article-title: Heterogeneous Multi-Layered Network Model for Omics Data Integration and Analysis
  publication-title: Front Genet
  doi: 10.3389/fgene.2019.01381
– volume: 45
  start-page: D985
  issue: D1
  year: 2017
  ident: 2022112111112326300_ref49
  article-title: Open Targets: a platform for therapeutic target identification and validation
  publication-title: Nucleic Acids Res
  doi: 10.1093/nar/gkw1055
– volume: 34
  start-page: D535
  issue: suppl_1
  year: 2006
  ident: 2022112111112326300_ref58
  article-title: BioGRID: a general repository for interaction datasets
  publication-title: Nucleic Acids Res
  doi: 10.1093/nar/gkj109
– volume: 45
  start-page: gkw993
  year: 2016
  ident: 2022112111112326300_ref73
  article-title: DrugCentral: online drug compendium
  publication-title: Nucleic Acids Res
– volume: 47
  start-page: D607
  issue: D1
  year: 2019
  ident: 2022112111112326300_ref57
  article-title: STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets
  publication-title: Nucleic Acids Res
  doi: 10.1093/nar/gky1131
– volume: 36
  start-page: 1234
  issue: 4
  year: 2020
  ident: 2022112111112326300_ref95
  article-title: BioBERT: a pre-trained biomedical language representation model for biomedical text mining
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/btz682
– year: 2021
  ident: 2022112111112326300_ref114
  article-title: Bringing light into the dark: A large-scale evaluation of knowledge graph embedding models under a unified framework
  publication-title: IEEE Trans Pattern Anal Mach Intell
– volume: 22
  year: 2021
  ident: 2022112111112326300_ref80
  article-title: PharmKG: a dedicated knowledge graph benchmark for bomedical data mining
  publication-title: Brief Bioinform
  doi: 10.1093/bib/bbaa344
– volume: 118
  issue: 15
  year: 2021
  ident: 2022112111112326300_ref56
  article-title: Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences
  publication-title: Proc Natl Acad Sci
  doi: 10.1073/pnas.2016239118
– volume: 47
  start-page: D948
  issue: D1
  year: 2019
  ident: 2022112111112326300_ref96
  article-title: The comparative toxicogenomics database: update 2019
  publication-title: Nucleic Acids Res
  doi: 10.1093/nar/gky868
– start-page: 3173
  volume-title: Proceedings of the 29th ACM International Conference on Information & Knowledge Management
  year: 2020
  ident: 2022112111112326300_ref79
  doi: 10.1145/3340531.3412776
– volume: 10
  start-page: 1
  issue: 1
  year: 2020
  ident: 2022112111112326300_ref102
  article-title: Preclinical validation of therapeutic targets predicted by tensor factorization on heterogeneous graphs
  publication-title: Sci Rep
  doi: 10.1038/s41598-020-74922-z
– volume: 34
  start-page: 2614
  issue: 15
  year: 2018
  ident: 2022112111112326300_ref90
  article-title: A global network of biomedical relationships derived from text
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/bty114
– volume: 22
  year: 2020
  ident: 2022112111112326300_ref24
  article-title: Biomedical data and computational models for drug repositioning: a comprehensive review
  publication-title: Brief Bioinform
– volume: 13
  start-page: 966
  issue: 12
  year: 2016
  ident: 2022112111112326300_ref60
  article-title: OmniPath: guidelines and gateway for literature-curated signaling pathway resources
  publication-title: Nat Methods
  doi: 10.1038/nmeth.4077
– volume: 36
  start-page: 603
  issue: 2
  year: 2020
  ident: 2022112111112326300_ref101
  article-title: Discovering protein drug targets using knowledge graph embeddings
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/btz600
– volume: 4
  start-page: 125ra31
  issue: 125
  year: 2012
  ident: 2022112111112326300_ref105
  article-title: Data-driven prediction of drug effects and interactions
  publication-title: Sci Transl Med
  doi: 10.1126/scitranslmed.3003377
– volume: 47
  start-page: D1250
  issue: D1
  year: 2019
  ident: 2022112111112326300_ref54
  article-title: RNAcentral: a hub of information for non-coding RNA sequences
  publication-title: Nucleic Acids Res
  doi: 10.1093/nar/gky1206
– volume: 46
  start-page: D1074
  issue: D1
  year: 2018
  ident: 2022112111112326300_ref84
  article-title: DrugBank 5.0: a major update to the DrugBank database for 2018
  publication-title: Nucleic Acids Res
  doi: 10.1093/nar/gkx1037
– volume: 48
  start-page: D498
  issue: D1
  year: 2020
  ident: 2022112111112326300_ref62
  article-title: The reactome pathway knowledgebase
  publication-title: Nucleic Acids Res
– volume: 596
  start-page: 583
  issue: 7873
  year: 2021
  ident: 2022112111112326300_ref4
  article-title: Highly accurate protein structure prediction with AlphaFold
  publication-title: Nature
  doi: 10.1038/s41586-021-03819-2
– volume: 47
  start-page: D955
  issue: D1
  year: 2019
  ident: 2022112111112326300_ref45
  article-title: Human Disease Ontology 2018 update: classification, content and workflow expansion
  publication-title: Nucleic Acids Res
  doi: 10.1093/nar/gky1032
– volume: 3
  start-page: 1
  issue: 1
  year: 2016
  ident: 2022112111112326300_ref109
  article-title: The FAIR Guiding Principles for scientific data management and stewardship
  publication-title: Scientific data
  doi: 10.1038/sdata.2016.18
– volume: 47
  start-page: D1005
  issue: D1
  year: 2019
  ident: 2022112111112326300_ref69
  article-title: The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019
  publication-title: Nucleic Acids Res
  doi: 10.1093/nar/gky1120
– volume: 21
  start-page: 566
  issue: 2
  year: 2020
  ident: 2022112111112326300_ref6
  article-title: Network-based methods for predicting essential genes or proteins: a survey
  publication-title: Brief Bioinform
  doi: 10.1093/bib/bbz017
– volume: 48
  start-page: D682
  issue: D1
  year: 2020
  ident: 2022112111112326300_ref53
  article-title: Ensembl 2020
  publication-title: Nucleic Acids Res
– start-page: 248
  volume-title: 2009 IEEE conference on computer vision and pattern recognition
  year: 2009
  ident: 2022112111112326300_ref108
  doi: 10.1109/CVPR.2009.5206848
– volume: 20
  start-page: 1
  issue: 1
  year: 2019
  ident: 2022112111112326300_ref29
  article-title: Evaluation of knowledge graph embedding approaches for drug-drug interaction prediction in realistic settings
  publication-title: BMC bioinformatics
  doi: 10.1186/s12859-019-3284-5
– volume: 36
  start-page: D440
  issue: suppl_1
  year: 2008
  ident: 2022112111112326300_ref86
  article-title: The gene ontology project in 2008
  publication-title: Nucleic Acids Res
  doi: 10.1093/nar/gkm883
– volume: 22
  year: 2020
  ident: 2022112111112326300_ref32
  article-title: Biological applications of knowledge graph embedding models
  publication-title: Brief Bioinform
– volume: 5
  year: 2016
  ident: 2022112111112326300_ref48
  article-title: Identifying ELIXIR core data resources
  publication-title: F1000Research
  doi: 10.12688/f1000research.9656.1
– volume: 19
  start-page: 83
  issue: 19
  year: 2018
  ident: 2022112111112326300_ref76
  article-title: Convolutional neural network based on SMILES representation of compounds for detecting chemical motif
  publication-title: BMC bioinformatics
– volume: 22
  year: 2020
  ident: 2022112111112326300_ref27
  article-title: Machine learning approaches and databases for prediction of drug–target interaction: a survey paper
  publication-title: Brief Bioinform
– volume: 45
  start-page: D995
  issue: D1
  year: 2017
  ident: 2022112111112326300_ref51
  article-title: Pharos: Collating protein information to shed light on the druggable genome
  publication-title: Nucleic Acids Res
  doi: 10.1093/nar/gkw1072
– volume: 23
  start-page: 2208
  issue: 9
  year: 2018
  ident: 2022112111112326300_ref28
  article-title: Machine learning for drug-target interaction prediction
  publication-title: Molecules
  doi: 10.3390/molecules23092208
– volume: 18
  start-page: 463
  issue: 6
  year: 2019
  ident: 2022112111112326300_ref3
  article-title: Applications of machine learning in drug discovery and development
  publication-title: Nat Rev Drug Discov
  doi: 10.1038/s41573-019-0024-5
– volume: 32
  start-page: D452
  issue: suppl_1
  year: 2004
  ident: 2022112111112326300_ref59
  article-title: IntAct: an open source molecular interaction database
  publication-title: Nucleic Acids Res
  doi: 10.1093/nar/gkh052
– volume: 35
  start-page: 1798
  issue: 8
  year: 2013
  ident: 2022112111112326300_ref77
  article-title: Representation learning: A review and new perspectives
  publication-title: IEEE Trans Pattern Anal Mach Intell
  doi: 10.1109/TPAMI.2013.50
– volume: 9
  start-page: 75
  issue: 1
  year: 2008
  ident: 2022112111112326300_ref41
  article-title: Biomedical ontologies: a functional perspective
  publication-title: Brief Bioinform
  doi: 10.1093/bib/bbm059
– volume: 6
  issue: 21
  year: 2001
  ident: 2022112111112326300_ref40
  article-title: Ontologies for molecular biology. Computer and Information
  publication-title: Science
– volume: 34
  start-page: i457
  issue: 13
  year: 2018
  ident: 2022112111112326300_ref19
  article-title: Modeling polypharmacy side effects with graph convolutional networks
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/bty294
SubjectTerms disease-gene prediction
Drug Discovery
drug-target discovery
Information Storage and Retrieval
Knowledge
knowledge graph embeddings
Machine Learning
Pattern Recognition, Automated
Title A review of biomedical datasets relating to drug discovery: a knowledge graph perspective
URI https://www.ncbi.nlm.nih.gov/pubmed/36151740
https://www.proquest.com/docview/2717684383
https://research.chalmers.se/publication/532491
Volume 23
linkProvider Oxford University Press
DOI 10.1093/bib/bbac404
ISSN 1477-4054