Individual risk and prognostic value prediction by machine learning for distant metastasis in pulmonary sarcomatoid carcinoma: a large cohort study based on the SEER database and the Chinese population


Bibliographic Details
Published in Frontiers in oncology Vol. 13; p. 1105224
Main Authors Yi, Xinglin; Xu, Wenhao; Tang, Guihua; Zhang, Lingye; Wang, Kaishan; Luo, Hu; Zhou, Xiangdong
Format Journal Article
Language English
Published Switzerland Frontiers Media S.A 26.06.2023
Summary: This study aimed to develop diagnostic and prognostic models for patients with pulmonary sarcomatoid carcinoma (PSC) and distant metastasis (DM). To develop the diagnostic model for DM, patients from the Surveillance, Epidemiology, and End Results (SEER) database were divided into a training set and an internal test set at a ratio of 7 to 3, while patients from a Chinese hospital were assigned to the external test set. Univariate logistic regression was used in the training set to screen for DM-related risk factors, which were then included in six machine learning (ML) models. To develop the prognostic model, which predicts survival of PSC patients with DM, patients from the SEER database were randomly divided into a training set and a validation set at a ratio of 7 to 3. Univariate and multivariate Cox regression analyses were also performed in the training set to identify independent factors, and a prognostic nomogram for cancer-specific survival (CSS) in PSC patients with DM was constructed. For the diagnostic model, 589 patients with PSC were enrolled in the training set, 255 in the internal test set, and 94 in the external test set. The extreme gradient boosting (XGB) algorithm performed best on the external test set, with an area under the curve (AUC) of 0.821. For the prognostic model, 270 PSC patients with DM were enrolled in the training set and 117 in the test set. The nomogram showed good accuracy, with AUCs of 0.803 for 3-month CSS and 0.869 for 6-month CSS in the test set. The ML model accurately identified individuals at high risk for DM who need more careful follow-up, including appropriate preventive therapeutic strategies. The prognostic nomogram accurately predicted CSS in PSC patients with DM.
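The summary gives no implementation details, so the following is only a minimal sketch of two ideas it mentions: a random 7-to-3 split and the AUC as a measure of discrimination. The function names `split_7_to_3` and `auc`, the seed, and the toy inputs are assumptions made for illustration; they are not the authors' code and use no SEER data.

```python
import random


def split_7_to_3(records, seed=0):
    """Randomly split records into a 70% training set and a 30% test set."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    cut = len(shuffled) * 7 // 10          # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy illustration only: 100 hypothetical patient IDs, not study data.
train_ids, test_ids = split_7_to_3(range(100))

# Four hypothetical cases: label 1 = DM present, score = predicted risk.
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported values of 0.821, 0.803, and 0.869 should be read.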
Edited by: Chuanming Li, Chongqing University Central Hospital, China
Reviewed by: Kathiravan Srinivasan, Vellore Institute of Technology, Vellore, India; Yinlong Yuan, Nantong University, China
ISSN:2234-943X
DOI:10.3389/fonc.2023.1105224