Semi-Supervised Hierarchical Drug Embedding in Hyperbolic Space
Main Authors | , , |
---|---|
Format | Journal Article |
Language | English |
Published | 01.06.2020 |
Summary: | Learning accurate drug representation is essential for tasks such as
computational drug repositioning and prediction of drug side-effects. A drug
hierarchy is a valuable source that encodes human knowledge of drug relations
in a tree-like structure where drugs that act on the same organs, treat the
same disease, or bind to the same biological target are grouped together.
However, its utility in learning drug representations has not yet been
explored, and currently described drug representations cannot place novel
molecules in a drug hierarchy. Here, we develop a semi-supervised drug
embedding that incorporates two sources of information: (1) underlying chemical
grammar that is inferred from molecular structures of drugs and drug-like
molecules (unsupervised), and (2) hierarchical relations that are encoded in an
expert-crafted hierarchy of approved drugs (supervised). We use the Variational
Auto-Encoder (VAE) framework to encode the chemical structures of molecules and
use the knowledge-based drug-drug similarity to induce the clustering of drugs
in hyperbolic space. Hyperbolic space is amenable to encoding hierarchical
concepts. Both quantitative and qualitative results support that the learned
drug embedding can accurately reproduce the chemical structure and induce the
hierarchical relations among drugs. Furthermore, our approach can infer the
pharmacological properties of novel molecules by retrieving similar drugs from
the embedding space. We demonstrate that the learned drug embedding can be used
to find new uses for existing drugs and to discover side-effects. We show that
it significantly outperforms baselines in both tasks. |
---|---|
DOI: | 10.48550/arxiv.2006.00986 |
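
The summary describes inferring the properties of a novel molecule by retrieving similar drugs from a hyperbolic embedding space. The sketch below illustrates that retrieval step only. It assumes the Poincaré ball model of hyperbolic space, which this record does not confirm. The function names `poincare_distance` and `nearest_drugs`, the drug labels, and the 2-D coordinates are all hypothetical and invented for illustration; they are not taken from the paper.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincare ball."""
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


def nearest_drugs(query, embeddings, k=1):
    """Return the names of the k embedded drugs closest to `query`."""
    ranked = sorted(
        embeddings,
        key=lambda name: poincare_distance(query, embeddings[name]),
    )
    return ranked[:k]


# Toy 2-D coordinates, invented purely for illustration (not from the paper).
toy_embeddings = {
    "drug_a": (0.10, 0.20),
    "drug_b": (0.12, 0.25),
    "drug_c": (-0.70, 0.50),
}
novel_molecule = (0.11, 0.22)

print(nearest_drugs(novel_molecule, toy_embeddings, k=2))
```

In this model, distances grow rapidly near the boundary of the ball, which is one reason hyperbolic embeddings are often used for tree-like data: there is room for many well-separated leaves far from the origin.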