Inferring phenotypes from substance use via collaborative matrix completion
Background Although substance use disorders (SUDs) are heritable, few genetic risk factors for them have been identified, in part due to the small sample sizes of study populations. To address this limitation, researchers have aggregated subjects from multiple existing genetic studies, but these sub...
Saved in:
Published in | BMC systems biology Vol. 12; no. Suppl 6; p. 104 |
---|---|
Main Authors | , , , , , |
Format | Journal Article |
Language | English |
Published |
London
BioMed Central
22.11.2018
BioMed Central Ltd |
Subjects | |
Online Access | Get full text |
ISSN | 1752-0509 1752-0509 |
DOI | 10.1186/s12918-018-0623-5 |
Cover
Loading…
Summary: | Background
Although substance use disorders (SUDs) are heritable, few genetic risk factors for them have been identified, in part due to the small sample sizes of study populations. To address this limitation, researchers have aggregated subjects from multiple existing genetic studies, but these subjects can have missing phenotypic information, including diagnostic criteria for certain substances that were not originally a focus of study. Recent advances in addiction neurobiology have shown that comorbid SUDs (e.g., the abuse of multiple substances) have similar genetic determinants, which makes it possible to infer missing SUD diagnostic criteria using criteria from another SUD and patient genotypes through statistical modeling.
Results
We propose a new approach based on matrix completion techniques to integrate features of comorbid health conditions and individual’s genotypes to infer unreported diagnostic criteria for a disorder. This approach optimizes a bi-linear model that uses the interactions between known disease correlations and candidate genes to impute missing criteria. An efficient stochastic and parallel algorithm was developed to optimize the model with a speed 20 times greater than the classic sequential algorithm. It was tested on 3441 subjects who had both cocaine and opioid use disorders and successfully inferred missing diagnostic criteria with consistently better accuracy than other recent statistical methods.
Conclusions
The proposed matrix completion imputation method is a promising tool to impute unreported or unobserved symptoms or criteria for disease diagnosis. Integrating data at multiple scales or from heterogeneous sources may help improve the accuracy of phenotype imputation. |
---|---|
Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 content type line 23 |
ISSN: | 1752-0509 1752-0509 |
DOI: | 10.1186/s12918-018-0623-5 |