Prediction models of human plasma protein binding rate and oral bioavailability derived by using GA–CG–SVM method
Published in | Journal of pharmaceutical and biomedical analysis Vol. 47; no. 4; pp. 677 - 682 |
---|---|
Main Authors | , , , , , |
Format | Journal Article |
Language | English |
Published | Amsterdam, Elsevier B.V, 05.08.2008, Elsevier Science |
Summary: | In this study, a support vector machine (SVM) method combined with a genetic algorithm (GA) for feature selection and a conjugate gradient (CG) method for parameter optimization (GA–CG–SVM) was employed to develop prediction models of human plasma protein binding rate (PPBR) and oral bioavailability (BIO). The advantage of GA–CG–SVM is that it handles feature selection and SVM parameter optimization simultaneously. Five-fold cross-validation and an independent test set were used to validate the prediction models. For PPBR, a total of 692 compounds were used to train and test the prediction model. The prediction accuracy by 5-fold cross-validation is 86%, and that for the independent test set (161 compounds) is 81%. These accuracies are markedly higher than those of the best model currently available in the literature. The number of descriptors selected is 29. For BIO, the training set comprises 690 compounds, and 76 external compounds form an independent validation set. The prediction accuracies for the training set by 5-fold cross-validation and for the independent test set are 80% and 86%, respectively, which are better than or comparable to those of other classification models in the literature. The number of descriptors selected is 25. For both PPBR and BIO, the descriptors selected by the GA–CG method cover a wide range of molecular properties, which implies that the PPBR and BIO of a drug may be affected by many complex factors. |
ISSN: | 0731-7085 1873-264X |
DOI: | 10.1016/j.jpba.2008.03.023 |
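
The workflow described in the Summary (GA-driven descriptor selection scored by 5-fold cross-validation, followed by a check on an independent test set) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes synthetic data in place of the molecular descriptors, swaps in a nearest-centroid classifier for the SVM, and omits the conjugate gradient parameter optimization entirely. All function names and settings (`make_data`, `ga_select`, population size, mutation rate) are hypothetical.

```python
import random

def make_data(rng, n=120, n_features=10, informative=(0, 1, 2)):
    """Synthetic two-class data (assumption): only `informative` columns carry signal."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 2.0 * label  # shift class 1 along the informative columns
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(X_tr, y_tr, X_te, y_te, cols):
    """Nearest-centroid classifier (stand-in for the SVM) on the selected columns."""
    cents = {}
    for c in (0, 1):
        rows = [x for x, t in zip(X_tr, y_tr) if t == c]
        cents[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    hits = 0
    for x, t in zip(X_te, y_te):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, cents[c])) for c in (0, 1)}
        hits += min(dist, key=dist.get) == t
    return hits / len(y_te)

def cv_fitness(mask, X, y, k=5):
    """Mean k-fold cross-validated accuracy for the feature subset encoded by `mask`."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    scores = []
    for f in range(k):
        tr = [i for i in range(len(y)) if i % k != f]
        te = [i for i in range(len(y)) if i % k == f]
        scores.append(centroid_accuracy([X[i] for i in tr], [y[i] for i in tr],
                                        [X[i] for i in te], [y[i] for i in te], cols))
    return sum(scores) / k

def ga_select(X, y, rng, pop_size=20, generations=15, p_mut=0.1):
    """Simple GA over binary feature masks: truncation selection,
    one-point crossover, and bit-flip mutation."""
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best_mask, best_fit = None, -1.0
    for _ in range(generations):
        scored = sorted(((cv_fitness(m, X, y), m) for m in pop),
                        key=lambda s: s[0], reverse=True)
        if scored[0][0] > best_fit:
            best_fit, best_mask = scored[0][0], scored[0][1][:]
        parents = [m for _, m in scored[: pop_size // 2]]  # keep the top half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]
            children.append([1 - g if rng.random() < p_mut else g for g in child])
        pop = parents + children
    return best_mask, best_fit

rng = random.Random(0)
X, y = make_data(rng)
X_train, y_train = X[:90], y[:90]   # used for GA selection and cross-validation
X_test, y_test = X[90:], y[90:]     # held out as an independent test set

mask, cv_acc = ga_select(X_train, y_train, rng)
cols = [j for j, bit in enumerate(mask) if bit]
test_acc = centroid_accuracy(X_train, y_train, X_test, y_test, cols)
print("selected columns:", cols)
print("5-fold CV accuracy:", round(cv_acc, 3), "| test accuracy:", round(test_acc, 3))
```

Reporting the cross-validated accuracy and the held-out accuracy side by side mirrors the two validation figures quoted in the Summary. The held-out figure is the less optimistic check, because the GA never sees those compounds during selection.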