Automated prediction of HIV drug resistance from genotype data

Bibliographic Details
Published in: BMC Bioinformatics, Vol. 17, Suppl. 8; p. 278
Main Authors: Shen, ChenHsiang; Yu, Xiaxia; Harrison, Robert W; Weber, Irene T
Format: Journal Article
Language: English
Published: England, BioMed Central Ltd, 31.08.2016
Summary: HIV/AIDS is a serious threat to public health. The emergence of drug resistance mutations diminishes the effectiveness of drug therapy for HIV/AIDS. Computational prediction of the drug resistance phenotype will enable efficient and timely selection of the best treatment regimens. A unified encoding of protein sequence and structure was used as the feature vector for predicting phenotypic resistance from genotype data. Two machine learning algorithms, Random Forest and K-nearest neighbor, were used, and the prediction accuracies were examined by five-fold cross-validation on the genotype-phenotype datasets. A supervised machine learning approach for automatic prediction of drug resistance was developed to handle genotype-phenotype datasets of HIV protease (PR) and reverse transcriptase (RT). It predicts the drug resistance phenotype and its relative severity from a query sequence. The classification accuracy was higher than 0.973 for eight PR inhibitors and higher than 0.986 for ten RT inhibitors. The overall cross-validated regression R² values for the severity of drug resistance were 0.772-0.953 for the eight PR inhibitors and 0.773-0.995 for the ten RT inhibitors. Machine learning using a unified encoding of sequence and protein structure as a feature vector provides an accurate prediction of drug resistance from genotype data. A practical webserver for clinicians has been implemented.
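
The workflow described in the summary (encode each sequence as a feature vector, classify it, and estimate accuracy by five-fold cross-validation) can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the authors' method: the `encode` function is a hypothetical mutation-indicator encoding rather than the unified sequence-structure encoding used in the article, the reference string and labelling rule are invented toy data, and only the K-nearest neighbor classifier is shown (Random Forest and the severity regression are omitted).

```python
import random


def encode(seq, reference):
    """Hypothetical encoding: 1 where the query differs from the reference, else 0."""
    return [int(a != b) for a, b in zip(seq, reference)]


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training vectors closest to x (Hamming distance)."""
    dists = sorted(
        (sum(a != b for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)


def five_fold_accuracy(X, y, k=3, folds=5, seed=0):
    """Shuffle once, hold out each fold in turn, and return pooled held-out accuracy."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(folds):
        test = idx[f::folds]          # every folds-th shuffled index forms one fold
        test_set = set(test)
        train = [i for i in idx if i not in test_set]
        tX = [X[i] for i in train]
        ty = [y[i] for i in train]
        correct += sum(knn_predict(tX, ty, X[i], k) == y[i] for i in test)
    return correct / len(X)


if __name__ == "__main__":
    reference = "ACDEFGHIKL"  # toy reference, not a real protease sequence
    rng = random.Random(1)
    seqs = []
    for _ in range(50):
        s = list(reference)
        # Mutate 0 to 3 randomly chosen positions to a placeholder residue.
        for pos in rng.sample(range(len(s)), rng.randint(0, 3)):
            s[pos] = "X"
        seqs.append("".join(s))
    X = [encode(s, reference) for s in seqs]
    # Toy labelling rule (an assumption): "resistant" if position 3 or 7 is mutated.
    y = [int(v[2] or v[6]) for v in X]
    print(five_fold_accuracy(X, y))
```

Every sample is held out exactly once, so the returned value is the fraction of held-out predictions that match their labels. A real genotype-phenotype dataset and the article's structure-aware encoding would replace the toy data and `encode`.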
ISSN: 1471-2105
DOI: 10.1186/s12859-016-1114-6