Classification of Long Non-Coding RNAs s Between Early and Late Stage of Liver Cancers From Non-coding RNA Profiles Using Machine-Learning Approach

Long non-coding RNAs (lncRNAs), which are RNA sequences greater than 200 nucleotides in length, play a crucial role in regulating gene expression and biological processes associated with cancer development and progression. Liver cancer is a major cause of cancer-related mortality worldwide, notably...

Full description

Saved in:
Bibliographic Details
Published inBioinformatics and biology insights Vol. 18; p. 11779322241258586
Main Authors Anuntakarun, Songtham, Khamjerm, Jakkrit, Tangkijvanich, Pisit, Chuaypen, Natthaya
Format Journal Article
LanguageEnglish
Published London, England SAGE Publications 01.01.2024
Sage Publications Ltd
SAGE Publishing
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Long non-coding RNAs (lncRNAs), which are RNA sequences greater than 200 nucleotides in length, play a crucial role in regulating gene expression and biological processes associated with cancer development and progression. Liver cancer is a major cause of cancer-related mortality worldwide, notably in Thailand. Although machine learning has been extensively used in analyzing RNA-sequencing data for advanced knowledge, the identification of potential lncRNA biomarkers for cancer, particularly focusing on lncRNAs as molecular biomarkers in liver cancer, remains comparatively limited. In this study, our objective was to identify candidate lncRNAs in liver cancer. We employed an expression data set of lncRNAs from patients with liver cancer, which comprised 40 699 lncRNAs sourced from The CancerLivER database. Various feature selection methods and machine-learning approaches were used to identify these candidate lncRNAs. The results showed that the random forest algorithm could predict lncRNAs using features extracted from the database, which achieved an area under the curve (AUC) of 0.840 for classifying lncRNAs between early (stage 1) and late stages (stages 2, 3, and 4) of liver cancer. Five of 23 significant lncRNAs (WAC-AS1, MAPKAPK5-AS1, ARRDC1-AS1, AC133528.2, and RP11-1094M14.11) were differentially expressed between early and late stage of liver cancer. Based on the Gene Expression Profiling Interactive Analysis (GEPIA) database, higher expression of WAC-AS1, MAPKAPK5-AS1, and ARRDC1-AS1 was associated with shorter overall survival. In conclusion, the classification model could predict the early and late stages of liver cancer using the signature expression of lncRNA genes. The identified lncRNAs might be used as early diagnostic and prognostic biomarkers for patients with liver cancer.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ISSN:1177-9322
1177-9322
DOI:10.1177/11779322241258586