Data Mining and Risk Prediction Based on Apriori Improved Algorithm for Lung Cancer

Bibliographic Details
Published in Journal of signal processing systems Vol. 93; no. 7; pp. 795-809
Main Authors Guo, Hong; Liu, Hong; Chen, JiaYou; Zeng, Yan
Format Journal Article
Language English
Published New York Springer US 01.07.2021
Springer Nature B.V
Summary: Starting from medical big data, this article applies data mining to the pathogenic factors of lung cancer, using several years of lung cancer electronic medical records from the oncology department of a top-tier (Grade III, Class A) hospital. When processing the large volume of data in these records, the traditional serial Apriori algorithm scans the database frequently, runs slowly, and consumes a large amount of memory. The authors therefore propose an improved Apriori algorithm based on the MapReduce distributed computing model of the Hadoop platform. Experiments on a cluster with lung cancer data show that the improved algorithm executes more efficiently and scales well on lung cancer big data, and that it can effectively mine the relationships between lung cancer and pathogenic factors, which the authors present as important guidance for assisting the clinical diagnosis and risk prediction of lung cancer.
ISSN: 1939-8018
1939-8115
DOI: 10.1007/s11265-021-01663-1
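
Illustrative sketch: the summary describes moving Apriori's support counting from a serial database scan to a MapReduce model. The record does not give the authors' implementation, so the following is only a minimal single-process Python sketch of the general idea, assuming the records are already split into partitions. The function names (`map_count`, `reduce_counts`, `next_candidates`, `apriori`), the item labels, and the toy records are all hypothetical and are not taken from the article or from Hadoop.

```python
from collections import Counter
from functools import reduce
from itertools import combinations


def map_count(partition, candidates):
    """Map step: count occurrences of each candidate itemset in one partition."""
    counts = Counter()
    for transaction in partition:
        for candidate in candidates:
            if candidate <= transaction:  # candidate is a subset of the record
                counts[candidate] += 1
    return counts


def reduce_counts(left, right):
    """Reduce step: merge the partial counts of two partitions."""
    return left + right


def next_candidates(frequent, k):
    """Build k-itemsets whose every (k-1)-subset is already frequent."""
    items = sorted({item for itemset in frequent for item in itemset})
    return {
        frozenset(combo)
        for combo in combinations(items, k)
        if all(frozenset(sub) in frequent for sub in combinations(combo, k - 1))
    }


def apriori(partitions, min_support):
    """Return {itemset: count} for itemsets seen in at least min_support records."""
    partitions = [[frozenset(t) for t in part] for part in partitions]
    candidates = {frozenset([i]) for part in partitions for t in part for i in t}
    result = {}
    k = 1
    while candidates:
        # In a real cluster each partition would be counted on a separate node.
        partial = [map_count(part, candidates) for part in partitions]
        totals = reduce(reduce_counts, partial, Counter())
        frequent = {c: n for c, n in totals.items() if n >= min_support}
        result.update(frequent)
        k += 1
        candidates = next_candidates(set(frequent), k)
    return result


# Invented toy records (not data from the article), split into two partitions.
partitions = [
    [{"factor_a", "factor_b", "diagnosis"}, {"factor_a", "diagnosis"}],
    [{"factor_a", "factor_b"}, {"factor_b"}, {"factor_a", "factor_b", "diagnosis"}],
]
result = apriori(partitions, min_support=3)
```

Each pass still reads every record once, but the per-partition counting is independent, which is the property a MapReduce job would exploit; the article's specific improvements to Apriori may differ from this sketch.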