Combining biomarkers linearly and nonlinearly for classification using the area under the ROC curve
Published in | Statistics in medicine Vol. 35; no. 21; pp. 3792 - 3809 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | England; Blackwell Publishing Ltd; Wiley Subscription Services, Inc; 20.09.2016 |
Summary: | In biomedical studies, it is often of interest to classify or predict a subject's disease status based on a variety of biomarker measurements. A commonly used classification criterion is based on the area under the receiver operating characteristic curve (AUC). Many methods have been proposed to optimize approximated empirical AUC criteria, but there are two limitations to the existing methods. First, most methods are only designed to find the best linear combination of biomarkers, which may not perform well when there is strong nonlinearity in the data. Second, many existing linear combination methods use gradient‐based algorithms to find the best marker combination, which often result in suboptimal local solutions. In this paper, we address these two problems by proposing a new kernel‐based AUC optimization method called ramp AUC (RAUC). This method approximates the empirical AUC loss function with a ramp function and finds the best combination by a difference of convex functions algorithm. We show that as a linear combination method, RAUC leads to a consistent and asymptotically normal estimator of the linear marker combination when the data are generated from a semiparametric generalized linear model, just as the smoothed AUC method does. Through simulation studies and real data examples, we demonstrate that RAUC outperforms smoothed AUC in finding the best linear marker combinations, and can successfully capture nonlinear patterns in the data to achieve better classification performance. We illustrate our method with a dataset from a recent HIV vaccine trial. Copyright © 2016 John Wiley & Sons, Ltd. |
---|---|
Bibliography: | ArticleID:SIM6956; ark:/67375/WNG-86S0KR5R-L; istex:89A7C7E2FE3732A48AAE88F5D2AF5CAE932B98F5; Henry M. Jackson Foundation; cooperative agreement - No. W81XWH-07-2-0067; National Institute of Allergy and Infectious Diseases - No. 1R56AI116369-01A1; No. R01-GM106177; No. UM1-AI-068635; Department of Defense; Supporting info item. Both authors contributed equally. |
ISSN: | 0277-6715 1097-0258 |
DOI: | 10.1002/sim.6956 |
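
The summary describes replacing the empirical AUC loss with a ramp function that can be optimized by a difference of convex functions algorithm. The sketch below is only an illustration of that idea, not the paper's implementation: the particular ramp (a hinge at 1 minus a hinge at 0), the linear score, the function names, and the toy data are all assumptions made here, and the optimization step itself is not shown.

```python
from itertools import product


def score(w, x):
    """Linear marker combination w'x (assumed form of the score)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def empirical_auc(w, cases, controls):
    """Share of (case, control) pairs ranked correctly; ties count as 1/2."""
    total = 0.0
    for x_case, x_ctrl in product(cases, controls):
        d = score(w, x_case) - score(w, x_ctrl)
        total += 1.0 if d > 0 else (0.5 if d == 0 else 0.0)
    return total / (len(cases) * len(controls))


def hinge(u, a):
    """Convex hinge function max(0, a - u)."""
    return max(0.0, a - u)


def ramp(u):
    """Assumed ramp: difference of two convex hinges.

    Equals 1 for u <= 0, 1 - u for 0 < u < 1, and 0 for u >= 1,
    so it is a bounded surrogate for the 0-1 pairwise ranking loss.
    """
    return hinge(u, 1.0) - hinge(u, 0.0)


def ramp_auc_loss(w, cases, controls):
    """Average ramp loss over all (case, control) score differences."""
    pairs = list(product(cases, controls))
    return sum(ramp(score(w, xc) - score(w, xn)) for xc, xn in pairs) / len(pairs)


# Hypothetical toy data: two biomarkers per subject.
cases = [(2.0, 1.0), (1.5, 0.5)]
controls = [(0.0, 0.0), (1.0, 2.0)]
w = (1.0, 0.0)

auc = empirical_auc(w, cases, controls)
loss = ramp_auc_loss(w, cases, controls)
```

Because the ramp is written as one convex function minus another, a difference of convex functions scheme can, in principle, linearize the subtracted hinge at the current iterate and solve a convex subproblem at each step. How the paper sets up those subproblems and the kernel extension is not stated in this record.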