Using machine learning to predict COVID-19 infection and severity risk among 4510 aged adults: a UK Biobank cohort study

Bibliographic Details
Published in	Scientific Reports Vol. 12; no. 1; pp. 7736 - 11
Main Authors	Willette, Auriel A., Willette, Sara A., Wang, Qian, Pappas, Colleen, Klinedinst, Brandon S., Le, Scott, Larsen, Brittany, Pollpeter, Amy, Li, Tianqi, Mochel, Jonathan P., Allenspach, Karin, Brenner, Nicole, Waterboer, Tim
Format	Journal Article
Language	English
Published	London: Nature Publishing Group UK, 11.05.2022; Nature Publishing Group; Nature Portfolio
Subjects	631/114/2413; 631/250; 692/499; 692/53; 692/699; 692/699/255/2514; Adult; Biobanks; Biological Specimen Banks; Body mass; Cohort analysis; Cohort Studies; Coronaviruses; COVID-19; COVID-19 - diagnosis; COVID-19 - epidemiology; Cytomegalovirus; Discriminant analysis; Health risks; Hospitalization; Humanities and Social Sciences; Humans; Infections; Infectious diseases; Lipids; Machine Learning; Middle Aged; multidisciplinary; Public health; Retrospective Studies; Risk Factors; SARS-CoV-2; Science; Science (multidisciplinary); Serology; United Kingdom - epidemiology; United Kingdom; United Kingdom > UK
Summary:	Many risk factors have emerged for novel 2019 coronavirus disease (COVID-19). It is largely unknown how these factors collectively predict the risk of COVID-19 infection or the risk of a severe infection (i.e., hospitalization). Among aged adults (69.3 ± 8.6 years) in UK Biobank, COVID-19 data were downloaded for 4510 participants with 7539 test cases. We downloaded baseline data collected 10 to 14 years earlier, including demographics, biochemistry, body mass, and other factors, as well as antibody titers for 20 common to rare infectious diseases in a subset of 80 participants with 124 test cases. Permutation-based linear discriminant analysis was used to predict COVID-19 risk and hospitalization risk. Probability and threshold metrics included receiver operating characteristic curves to derive area under the curve (AUC), specificity, sensitivity, and quadratic mean. Model predictions using the full cohort were marginal. The “best-fit” model for predicting COVID-19 risk was found in the subset of participants with antibody titers, and it achieved excellent discrimination (AUC 0.969, 95% CI 0.934–1.000). Factors included age, immune markers, lipids, and serology titers to common pathogens such as human cytomegalovirus. The hospitalization “best-fit” model was more modest (AUC 0.803, 95% CI 0.663–0.943) and included only serology titers, again in the subset group. Accurate risk profiles can be created using standard self-report and biomedical data collected in public health and medical settings. It is also worthwhile to investigate further whether prior host immunity predicts current host immunity to COVID-19.
ISSN:	2045-2322
DOI:	10.1038/s41598-022-07307-z
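
Note on the metrics named in the Summary: the sketch below illustrates how AUC, sensitivity, specificity, and quadratic mean can be computed from predicted scores and binary outcomes. It is a minimal illustration, not the authors' pipeline. It does not implement permutation-based linear discriminant analysis; the function names `auc` and `threshold_metrics`, the 0.5 threshold, and the toy scores and labels are all hypothetical and do not come from the study. The quadratic mean is assumed here to be the root mean square of sensitivity and specificity, which the record itself does not define.

```python
from math import sqrt


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    Counts the fraction of (positive, negative) pairs in which the positive
    case receives the higher score; ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def threshold_metrics(scores, labels, threshold):
    """Sensitivity, specificity, and their quadratic mean at one threshold.

    Assumption: quadratic mean = sqrt((sensitivity^2 + specificity^2) / 2).
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    quadratic_mean = sqrt((sensitivity ** 2 + specificity ** 2) / 2)
    return sensitivity, specificity, quadratic_mean


# Hypothetical toy data: predicted risk scores and true outcomes (1 = positive).
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

print("AUC:", auc(scores, labels))
print("sensitivity, specificity, quadratic mean:",
      threshold_metrics(scores, labels, 0.5))
```

AUC summarizes discrimination across all thresholds, whereas sensitivity, specificity, and the quadratic mean depend on the single threshold chosen, which is why the Summary reports both kinds of metric.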