Prostate cancer prediction using classification algorithms

Bibliographic Details
Published in Journal of Clinical Oncology Vol. 40; no. 16_suppl; p. e13590
Main Authors Huo, Xingyue; Finkelstein, Joseph
Format Journal Article
Language English
Published American Society of Clinical Oncology 01.06.2022

Summary: Background: The objective of this study was to evaluate the efficacy of machine learning classification algorithms for prostate cancer prediction using patients' age, CCI score, and medical procedures.
Methods: We analyzed a dataset of 112,605 individuals from the IBM MarketScan database between 2016 and 2019, dividing them into two groups based on whether they were diagnosed with prostate cancer in 2019. We then retrospectively reviewed these subjects' medical records, including CCI scores, medical procedure records, and ages. Five machine learning classifiers were used to predict prostate cancer and were compared on prediction accuracy: support vector machine (SVM), decision tree (DT), random forest (RF), Extreme Gradient Boosting (XGBoost), and adaptive boosting (AdaBoost). For every model, 70% of the data samples, chosen at random, were used for training and the remaining 30% were used as a test set. The classifier with the highest accuracy was selected for prediction and for identifying feature importance.
Results: The study included 112,605 subjects, of whom 19,564 (17.4%) were men diagnosed with prostate cancer in 2019. The best model for prostate cancer prediction was XGBoost, with a prediction accuracy of 84% on the test set. "Minor male genital procedures" had the highest feature importance score, while age was not an important feature for prostate cancer prediction among elderly males.
Conclusions: The more urethral/male genital examination procedures an individual has received in the past two years, the more attention should be paid to the risk of prostate cancer. These results suggest that analyzing individual medical records with machine learning classification algorithms can improve the prediction of prostate cancer.
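The evaluation workflow described in the Methods (random 70/30 split, fit each candidate, keep the one with the highest test accuracy) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the data are synthetic, the `procedures` feature and both stand-in models (`MajorityClass`, `ProcedureCountStump`) are hypothetical, and only the 17.4% prevalence is taken from the abstract. In practice the same loop would be run over real SVM, decision tree, random forest, XGBoost, and AdaBoost implementations.

```python
import random

def make_synthetic_rows(n, seed=0):
    """Synthetic (features, label) pairs; only the 17.4% prevalence comes from the abstract."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = 1 if rng.random() < 0.174 else 0
        # Assumption for illustration: positives tend to have more recorded procedures.
        procedures = rng.randint(2, 6) if label else rng.randint(0, 3)
        rows.append(({"procedures": procedures}, label))
    return rows

def split_70_30(rows, seed=0):
    """Shuffle a copy of the rows and return (train, test) in a 70/30 ratio."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 7 // 10
    return shuffled[:cut], shuffled[cut:]

def accuracy(model, rows):
    """Fraction of rows whose predicted label equals the true label."""
    correct = sum(1 for features, label in rows if model.predict(features) == label)
    return correct / len(rows)

class MajorityClass:
    """Hypothetical baseline: always predicts the most common training label."""
    def fit(self, rows):
        labels = [label for _, label in rows]
        self.label = max(set(labels), key=labels.count)
        return self

    def predict(self, features):
        return self.label

class ProcedureCountStump:
    """Hypothetical one-feature rule: predict 1 when procedures >= a learned threshold."""
    def fit(self, rows):
        def train_hits(threshold):
            return sum(1 for f, y in rows if (f["procedures"] >= threshold) == (y == 1))
        candidates = sorted({f["procedures"] for f, _ in rows})
        self.threshold = max(candidates, key=train_hits)
        return self

    def predict(self, features):
        return 1 if features["procedures"] >= self.threshold else 0

def select_best(models, train, test):
    """Fit every model on train, score on test, return (best name, all scores)."""
    scores = {name: accuracy(model.fit(train), test) for name, model in models.items()}
    return max(scores, key=scores.get), scores

rows = make_synthetic_rows(1000)
train, test = split_70_30(rows)
models = {"majority": MajorityClass(), "stump": ProcedureCountStump()}
best_name, scores = select_best(models, train, test)
print(len(train), len(test), best_name, scores)
```

Including a majority-class baseline in the comparison is a design choice of this sketch: with roughly 17% positive cases, it gives a reference point for judging any accuracy figure.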
Bibliography: Abstract Disclosures
ISSN: 0732-183X, 1527-7755
DOI: 10.1200/JCO.2022.40.16_suppl.e13590