Identification of key genes and development of an identifying machine learning model for sepsis


Bibliographic Details
Published in Inflammation research Vol. 74; no. 1; p. 100
Main Authors Li, Zhonghao; Chen, Shengsong; Gao, Nan; Chen, Jie; Qin, Ying; Zhang, Guoqiang
Format Journal Article
Language English
Published Cham Springer International Publishing 01.12.2025
Springer Nature B.V

Summary: Objective and design: This study aims to identify key genes of sepsis and to construct a model for sepsis identification through integrated multi-organ single-cell RNA sequencing (scRNA-seq) and machine learning. Material or subjects: Datasets downloaded from the Gene Expression Omnibus (GSE207363, GSE207651, GSE185263, GSE69063 and GSE134347) were used. Methods: ScRNA-seq data from heart (GSE207363) and lung (GSE207651) tissues of septic mice were processed and analyzed using the Seurat package in R. Key genes were defined as those found in both heart and lung tissues at the overlap of three analyses combined with differential expression analyses. Support vector machine recursive feature elimination (SVM-RFE) was then used to construct a model for sepsis identification based on these key genes. The GSE185263 dataset was used for training, while GSE69063 and GSE134347 were used for testing. The model's accuracy in identifying sepsis was assessed on the test datasets by the area under the receiver operating characteristic curve (AUROC). Results: Thirteen genes were initially identified as key genes; after translation to their human homologs, ten genes remained. The optimal SVM-RFE model incorporated eight of these genes (CAMP, CD74, HLA-DQA1, HLA-DQB1, HLA-DMA, HLA-DRB5, and LYZ). In the two test datasets, the AUROC values were 0.904 and 0.924, respectively. Conclusions: We have identified several key genes and developed a machine learning model for sepsis identification. Further studies are needed to validate our findings.
ISSN:1023-3830
1420-908X
DOI:10.1007/s00011-025-02068-7
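
Note on the AUROC metric reported in the summary: the AUROC can be read as the probability that a randomly chosen sepsis sample receives a higher model score than a randomly chosen control sample, with ties counted as one half. The minimal Python sketch below illustrates that definition only. The function name `auroc` and the toy labels and scores are hypothetical and are not taken from the study; the authors' own analysis pipeline and software for this step are not described in this record.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 class labels (1 = positive, e.g. sepsis).
    scores: iterable of model scores, higher meaning "more likely positive".
    Returns the fraction of (positive, negative) pairs in which the positive
    sample has the higher score; tied pairs contribute 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): three positives, three negatives.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auc = auroc(toy_labels, toy_scores)  # 8 of 9 pairs are ranked correctly
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading the reported 0.904 and 0.924. The pairwise loop is quadratic in sample count; it is chosen here for clarity rather than speed.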