Machine Learning Prediction of Progression in Forced Expiratory Volume in 1 Second in the COPDGene® Study
Published in | Chronic obstructive pulmonary diseases Vol. 9; no. 3; pp. 349 - 365 |
---|---|
Main Authors | , , , , , , , , , , |
Format | Journal Article |
Language | English |
Published | United States: COPD Foundation Inc, 29.07.2022 |
Summary: | The heterogeneous nature of COPD complicates the identification of the predictors of disease progression. We aimed to improve the prediction of disease progression in COPD by using machine learning and incorporating a rich dataset of phenotypic features.
We included 4,496 smokers with available data from their enrollment and 5-year follow-up visits in the Genetic Epidemiology of COPD (COPDGene) study. We constructed linear regression (LR) and supervised random forest (RF) models to predict 5-year progression in FEV1 from 46 baseline features. Using cross-validation, we randomly partitioned participants into training and testing samples. We also validated the results in the COPDGene 10-year follow-up visit.
Predicting the change in FEV1 over time is more challenging than simply predicting the future absolute FEV1 level. For RF, R-squared was 0.15 and the area under the ROC curve for predicting subjects in the top quartile of observed progression was 0.71 in the testing sample; the corresponding values in the validation sample were 0.10 and 0.70. RF provided slightly better performance than LR. Accuracy was best for GOLD 1-2 subjects, and accurate prediction was harder to achieve in advanced stages of the disease. Predictive variables differed in their relative importance, and their importance also varied for the predictions by GOLD stage.
RF along with deep phenotyping predicts FEV1 progression with reasonable accuracy. There is significant room for improvement in future models. This prediction model facilitates the identification of smokers at increased risk for rapid disease progression. Such findings may be useful in the selection of patient populations for targeted clinical trials. |
---|---|
Bibliography: | Author contributions: Drs. Boueiz and Castaldi had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. COPDGene investigators were instrumental in the design and implementation of the COPDGene study and collected and analyzed data cited in this article. All authors have reviewed, approved and endorsed all content and conclusions of this article.
Dr. Castaldi reports grants from the National Institutes of Health during the conduct of the study, grants and other from GlaxoSmithKline, and personal fees from Novartis, outside the submitted work. Dr. Silverman reports grants from the National Institutes of Health during the conduct of the study, and grants and other from GlaxoSmithKline, outside the submitted work. All other authors have nothing to disclose.
The authors wish to thank the thousands of patients who participated in the COPDGene study over the last 10 years.
COPDGene Investigators - Core Units
Administrative Center: James D. Crapo, MD (PI); Edwin K. Silverman, MD, PhD (PI); Barry J. Make, MD; Elizabeth A. Regan, MD, PhD
Genetic Analysis Center: Terri H. Beaty, PhD; Peter J. Castaldi, MD, MSc; Michael H. Cho, MD, MPH; Dawn L. DeMeo, MD, MPH; Adel Boueiz, MD, MMSc; Marilyn G. Foreman, MD, MS; Auyon Ghosh, MD; Lystra P. Hayden, MD, MMSc; Craig P. Hersh, MD, MPH; Jacqueline Hetmanski, MS; Brian D. Hobbs, MD, MMSc; John E. Hokanson, MPH, PhD; Wonji Kim, PhD; Nan Laird, PhD; Christoph Lange, PhD; Sharon M. Lutz, PhD; Merry-Lynn McDonald, PhD; Dmitry Prokopenko, PhD; Matthew Moll, MD, MPH; Jarrett Morrow, PhD; Dandi Qiao, PhD; Elizabeth A. Regan, MD, PhD; Aabida Saferali, PhD; Phuwanat Sakornsakolpat, MD; Edwin K. Silverman, MD, PhD; Emily S. Wan, MD; Jeong Yun, MD, MPH
Imaging Center: Juan Pablo Centeno; Jean-Paul Charbonnier, PhD; Harvey O. Coxson, PhD; Craig J. Galban, PhD; MeiLan K. Han, MD, MS; Eric A. Hoffman; Stephen Humphries, PhD; Francine L. Jacobson, MD, MPH; Philip F. Judy, PhD; Ella A. Kazerooni, MD; Alex Kluiber; David A. Lynch, MB; Pietro Nardelli, PhD; John D. Newell, Jr., MD; Aleena Notary; Andrea Oh, MD; Elizabeth A. Regan, MD, PhD; James C. Ross, PhD; Raul San Jose Estepar, PhD; Joyce Schroeder, MD; Jered Sieren; Berend C. Stoel, PhD; Juerg Tschirren, PhD; Edwin Van Beek, MD, PhD; Bram van Ginneken, PhD; Eva van Rikxoort, PhD; Gonzalo Vegas Sanchez-Ferrero, PhD; Lucas Veitel; George R. Washko, MD; Carla G. Wilson, MS
Pulmonary Function Testing Quality Assurance Center, Salt Lake City, Utah: Robert Jensen, PhD
Data Coordinating Center and Biostatistics, National Jewish Health, Denver, Colorado: Douglas Everett, PhD; Jim Crooks, PhD; Katherine Pratte, PhD; Matt Strand, PhD; Carla G. Wilson, MS
Epidemiology Core, University of Colorado Anschutz Medical Campus, Aurora, Colorado: John E. Hokanson, MPH, PhD; Erin Austin, PhD; Gregory Kinney, MPH, PhD; Sharon M. Lutz, PhD; Kendra A. Young, PhD
Mortality Adjudication Core: Surya P. Bhatt, MD; Jessica Bon, MD; Alejandro A. Diaz, MD, MPH; MeiLan K. Han, MD, MS; Barry Make, MD; Susan Murray, ScD; Elizabeth Regan, MD; Xavier Soler, MD; Carla G. Wilson, MS
Biomarker Core: Russell P. Bowler, MD, PhD; Katerina Kechris, PhD; Farnoush Banaei-Kashani, PhD
COPDGene Investigators - Clinical Centers
Ann Arbor VA, Ann Arbor, Michigan: Jeffrey L. Curtis, MD; Perry G. Pernicano, MD
Baylor College of Medicine, Houston, Texas: Nicola Hanania, MD, MS; Mustafa Atik, MD; Aladin Boriek, PhD; Kalpatha Guntupalli, MD; Elizabeth Guy, MD; Amit Parulekar, MD
Brigham and Women’s Hospital, Boston, MA: Dawn L. DeMeo, MD, MPH; Craig Hersh, MD, MPH; Francine L. Jacobson, MD, MPH; George Washko, MD
Columbia University, New York, New York: R. Graham Barr, MD, DrPH; John Austin, MD; Belinda D’Souza, MD; Byron Thomashow, MD
Duke University Medical Center, Durham, North Carolina: Neil MacIntyre, Jr., MD; H. Page McAdams, MD; Lacey Washington, MD
HealthPartners Research Institute, Minneapolis, Minnesota: Charlene McEvoy, MD, MPH; Joseph Tashjian, MD
Johns Hopkins University, Baltimore, Maryland: Robert Wise, MD; Robert Brown, MD; Nadia N. Hansel, MD, MPH; Karen Horton, MD; Allison Lambert, MD, MHS; Nirupama Putcha, MD, MHS
Lundquist Institute for Biomedical Innovation at Harbor UCLA Medical Center, Torrance, California: Richard Casaburi, PhD, MD; Alessandra Adami, PhD; Matthew Budoff, MD; Hans Fischer, MD; Janos Porszasz, MD, PhD; Harry Rossiter, PhD; William Stringer, MD
Michael E. DeBakey VAMC, Houston, Texas: Amir Sharafkhaneh, MD, PhD; Charlie Lan, DO
Minneapolis VA, Minneapolis, Minnesota: Christine Wendt, MD; Brian Bell, MD; Ken M. Kunisaki, MD, MS
Morehouse School of Medicine, Atlanta, Georgia: Eric L. Flenaugh, MD; Hirut Gebrekristos, PhD; Mario Ponce, MD; Silanath Terpenning, MD; Gloria Westney, MD, MS
National Jewish Health, Denver, Colorado: Russell Bowler, MD, PhD; David A. Lynch, MB
Reliant Medical Group, Worcester, MA: Richard Rosiello, MD; David Pace, MD
Temple University, Philadelphia, Pennsylvania: Gerard Criner, MD; David Ciccolella, MD; Francis Cordova, MD; Chandra Dass, MD; Gilbert D’Alonzo, DO; Parag Desai, MD; Michael Jacobs, PharmD; Steven Kelsen, MD, PhD; Victor Kim, MD; A. James Mamary, MD; Nathaniel Marchetti, DO; Aditi Satti, MD; Kartik Shenoy, MD; Robert M. Steiner, MD; Alex Swift, MD; Irene Swift, MD; Maria Elena Vega-Sanchez, MD
University of Alabama, Birmingham, Alabama: Mark Dransfield, MD; William Bailey, MD; Surya P. Bhatt, MD; Anand Iyer, MD; Hrudaya Nath, MD; J. Michael Wells, MD
University of California, San Diego, California: Douglas Conrad, MD; Xavier Soler, MD, PhD; Andrew Yen, MD
University of Iowa, Iowa City, Iowa: Alejandro P. Comellas, MD; Karin F. Hoth, PhD; John Newell, Jr., MD; Brad Thompson, MD
University of Michigan, Ann Arbor, Michigan: MeiLan K. Han, MD, MS; Ella Kazerooni, MD, MS; Wassim Labaki, MD, MS; Craig Galban, PhD; Dharshan Vummidi, MD
University of Minnesota, Minneapolis, Minnesota: Joanne Billings, MD; Abbie Begnaud, MD; Tadashi Allen, MD
University of Pittsburgh, Pittsburgh, Pennsylvania: Frank Sciurba, MD; Jessica Bon, MD; Divay Chandra, MD, MSc; Joel Weissfeld, MD, MPH
University of Texas Health, San Antonio, San Antonio, Texas: Antonio Anzueto, MD; Sandra Adams, MD; Diego Maselli-Caceres, MD; Mario E. Ruiz, MD; Harjinder Singh |
ISSN: | 2372-952X |
DOI: | 10.15326/jcopdf.2021.0275 |
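
The summary reports two evaluation metrics: R-squared for the predicted change in FEV1, and the area under the ROC curve (AUC) for identifying subjects in the top quartile of observed progression. The following is a minimal sketch of how such metrics can be computed. It assumes progression is expressed as a positive decline (baseline minus follow-up FEV1), so larger values mean faster progression. The function names and the toy numbers are hypothetical illustrations, not the study's code or data.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def top_quartile_labels(observed_decline):
    """Return 1 for subjects in the top 25% of observed decline, else 0.

    Assumption: larger decline = faster progression. Values tied with the
    cutoff are all labeled 1, so slightly more than 25% may be positive.
    """
    ranked = sorted(observed_decline, reverse=True)
    n_top = max(1, len(ranked) // 4)
    cutoff = ranked[n_top - 1]
    return [1 if d >= cutoff else 0 for d in observed_decline]


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive subject
    receives a higher score than a randomly chosen negative one
    (ties count as 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy values (mL of FEV1 decline over 5 years), not study data.
observed = [400, 350, 300, 250, 200, 150, 100, 50]
predicted = [300, 200, 320, 260, 180, 220, 120, 90]

labels = top_quartile_labels(observed)
print("top-quartile labels:", labels)
print("R-squared:", round(r_squared(observed, predicted), 3))
print("AUC:", round(roc_auc(labels, predicted), 3))
```

The predicted decline is used directly as the ranking score for the AUC, which is one plausible way to turn a regression output into a top-quartile classifier; the record does not state how the study derived its classification scores.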