Development and verification of prediction models for preventing cardiovascular diseases

Bibliographic Details
Published in PLoS ONE, Vol. 14, no. 9, p. e0222809
Main Authors Sung, Ji Min, Cho, In-Jeong, Sung, David, Kim, Sunhee, Kim, Hyeon Chang, Chae, Myeong-Hun, Kavousi, Maryam, Rueda-Ochoa, Oscar L, Ikram, M Arfan, Franco, Oscar H, Chang, Hyuk-Jae
Format Journal Article
Language English
Published United States: Public Library of Science (PLoS), 19 September 2019
Summary: Cardiovascular disease (CVD) is one of the major causes of death worldwide. To improve the accuracy of CVD prediction, risk classification was performed using national time-series health examination data. These data offer an opportunity to apply deep learning (RNN-LSTM), which is widely regarded as a strong approach for analyzing time-series datasets. The objective of this study was to show the improved accuracy of deep learning by comparing the performance of a Cox hazard regression model and an RNN-LSTM model based on survival analysis. We selected 361,239 subjects (aged 40 to 79 years) with more than two health examination records from 2002-2006 using the National Health Insurance System-National Health Screening Cohort (NHIS-HEALS). The average number of health screenings (from 2002-2013) used in the analysis was 2.9 ± 1.0. Two CVD prediction models were developed from the NHIS-HEALS data: a Cox hazard regression model and a deep learning model. In an internal validation on the NHIS-HEALS dataset, the Cox regression model reached its highest time-dependent area under the curve (AUC) at 2 years: 0.79 (95% CI 0.70 to 0.87) in females and 0.75 (95% CI 0.70 to 0.80) in males. The deep learning model also reached its highest time-dependent AUC at 2 years: 0.94 (95% CI 0.91 to 0.97) in females and 0.96 (95% CI 0.95 to 0.97) in males. Layer-wise Relevance Propagation (LRP) revealed that age was the variable with the greatest effect on CVD, followed by systolic blood pressure (SBP) and diastolic blood pressure (DBP), in that order. The deep learning model therefore predicted CVD occurrence better than the Cox regression model. In addition, the risk factors extracted from the study results using LRP were confirmed to match known risk factors shown to be important in previous clinical studies.
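Illustration: the summary compares the two models by their AUC at a fixed horizon (2 years). The sketch below shows one simplified way such a figure can be computed from predicted risk scores. It is not the authors' method: the function name `auc_at_horizon` and the toy data are hypothetical, and subjects censored before the horizon are simply dropped, whereas a proper time-dependent AUC estimator would account for censoring (for example with inverse-probability weights).

```python
from typing import Sequence


def auc_at_horizon(
    risk: Sequence[float],
    event_time: Sequence[float],
    had_event: Sequence[bool],
    horizon: float,
) -> float:
    """Simplified AUC at a fixed horizon (assumption: no censoring weights).

    Cases are subjects whose event occurred at or before `horizon`.
    Controls are subjects still under follow-up after `horizon`.
    Subjects censored at or before `horizon` are dropped.
    """
    cases = [
        r for r, t, e in zip(risk, event_time, had_event) if e and t <= horizon
    ]
    controls = [r for r, t in zip(risk, event_time) if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control at this horizon")

    # Fraction of (case, control) pairs in which the case has the higher
    # predicted risk; ties count as half a concordant pair.
    concordant = 0.0
    for case_risk in cases:
        for control_risk in controls:
            if case_risk > control_risk:
                concordant += 1.0
            elif case_risk == control_risk:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Hypothetical toy data: six subjects, follow-up time in years.
toy_risk = [0.9, 0.25, 0.3, 0.2, 0.6, 0.1]
toy_time = [1.0, 1.5, 5.0, 6.0, 0.5, 4.0]
toy_event = [True, True, False, True, False, False]

toy_auc = auc_at_horizon(toy_risk, toy_time, toy_event, horizon=2.0)
print(f"AUC at 2 years (toy data): {toy_auc:.3f}")
```

In this toy example, two subjects are cases, three are controls, and one is dropped because follow-up ended before 2 years without an event. Five of the six case-control pairs are ranked correctly.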
Competing Interests: One author, who is currently employed by KT NexR, was a member of the Yonsei University College of Medicine at the time of the study; KT NexR is therefore not related to this study. Another author, who is currently employed by Selvas AI Inc., participated in the research to develop a deep learning model. Selvas AI Inc. did not provide funding. Outcomes of this study for Selvas AI Inc. were three Korean patents and Selvy Checkup (a marketed product). This does not alter our adherence to PLOS ONE policies on sharing data and materials.
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0222809