Estimation of pKa for Druglike Compounds Using Semiempirical and Information-Based Descriptors
Published in | Journal of Chemical Information and Modeling, Vol. 47, no. 2, pp. 450-459 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | United States, American Chemical Society, 01.03.2007 |
Summary: | A pragmatic approach has been developed for the estimation of aqueous ionization constants (pKa) for druglike compounds. The method involves an algorithm that assigns ionization constants in a stepwise manner to the acidic and basic groups present in a compound. Predictions are made for each ionizable group using models derived from semiempirical quantum chemical properties and information-based descriptors. Semiempirical properties include the partial charge and electrophilic superdelocalizability of the atom(s) undergoing protonation or deprotonation. Importantly, the latter property has been extended to allow predictions to be made for multiprotic compounds, overcoming limitations of a previous approach described by Tehan et al. The information-based descriptors include molecular-tree structured fingerprints, based on the methodology outlined by Xing et al., with the addition of 2D substructure flags indicating the presence of other important structural features. These two classes of descriptor were found to complement one another particularly well, resulting in predictive models for a range of functional groups (including alcohols, amidines, amines, anilines, carboxylic acids, guanidines, imidazoles, imines, phenols, pyridines, and pyrimidines). Combined RMSE values of 0.48 and 0.81 pKa units were obtained for the training set and an external test set of compounds, respectively. The predictive models were based on compounds selected from the commercially available BioLoom database. The resultant speed and accuracy of the approach have also enabled the development of a Web application on the Novartis intranet for pKa prediction. |
---|---|
Bibliography: | istex:F1767F6377880435E10F1DC4590BA36EF3DD9A67 ark:/67375/TPS-Z327L8FV-5 |
ISSN: | 1549-9596 1549-960X |
DOI: | 10.1021/ci600285n |
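
The summary describes an algorithm that assigns ionization constants to the groups in a compound in a stepwise manner. The record does not give the details, so the following Python sketch is only one plausible reading for acidic groups: the group with the lowest estimated pKa is assigned first, and the remaining groups are then re-estimated given the groups already ionized. The function names, the `toy_predict` model, its base values, and the 1.0-unit shift per ionized group are all hypothetical placeholders, not the descriptor-based models or values from the paper.

```python
from typing import Callable, FrozenSet, List, Tuple

# Hypothetical predictor signature: (group name, groups already ionized) -> pKa.
Predictor = Callable[[str, FrozenSet[str]], float]


def assign_pka_stepwise(groups: List[str], predict: Predictor) -> List[Tuple[str, float]]:
    """Assign pKa values to acidic groups one at a time (illustrative only).

    At each step every remaining group is re-estimated in the context of the
    groups already ionized, and the most acidic one (lowest pKa) is assigned.
    """
    ionized: FrozenSet[str] = frozenset()
    remaining = list(groups)
    assigned: List[Tuple[str, float]] = []
    while remaining:
        estimates = {g: predict(g, ionized) for g in remaining}
        most_acidic = min(estimates, key=estimates.get)
        assigned.append((most_acidic, estimates[most_acidic]))
        ionized = ionized | {most_acidic}
        remaining.remove(most_acidic)
    return assigned


# Made-up base values, used only to exercise the loop above.
TOY_BASE_PKA = {"carboxylic_acid": 4.0, "phenol": 10.0}


def toy_predict(group: str, ionized: FrozenSet[str]) -> float:
    """Placeholder model: each group already ionized raises the pKa by 1.0."""
    return TOY_BASE_PKA[group] + 1.0 * len(ionized)


if __name__ == "__main__":
    for name, pka in assign_pka_stepwise(["phenol", "carboxylic_acid"], toy_predict):
        print(f"{name}: {pka:.1f}")
```

In a real workflow, `toy_predict` would be replaced by a fitted model for each functional group; basic groups would presumably be handled in the opposite order (highest pKa first), which this sketch does not cover.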