Integrative Comparison of Burrows-Wheeler Transform-Based Mapping Algorithm with de Bruijn Graph for Identification of Lung/Liver Cancer-Specific Gene
Cancers of the lung and liver are the top 10 leading causes of cancer death worldwide. Thus, it is essential to identify the genes specifically expressed in these two cancer types to develop new therapeutics. Although many messenger RNA (mRNA) sequencing data related to these cancer cells are availa...
Saved in:
Published in | Journal of microbiology and biotechnology Vol. 32; no. 2; pp. 149 - 159 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published |
Korea (South)
The Korean Society for Microbiology and Biotechnology
28.02.2022
한국미생물·생명공학회 |
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | Cancers of the lung and liver are the top 10 leading causes of cancer death worldwide. Thus, it is essential to identify the genes specifically expressed in these two cancer types to develop new therapeutics. Although many messenger RNA (mRNA) sequencing data related to these cancer cells are available due to the advancement of next-generation sequencing (NGS) technologies, optimized data processing methods need to be developed to identify the novel cancer-specific genes. Here, we conducted an analytical comparison between Bowtie2, a Burrows-Wheeler transform-based alignment tool, and Kallisto, which adopts pseudo alignment based on a transcriptome de Bruijn graph using mRNA sequencing data on normal cells and lung/liver cancer tissues. Before using cancer data, simulated mRNA sequencing reads were generated, and the high Transcripts Per Million (TPM) values were compared. mRNA sequencing reads data on lung/liver cancer cells were also extracted and quantified. While Kallisto could directly give the output in TPM values, Bowtie2 provided the counts. Thus, TPM values were calculated by processing the Sequence Alignment Map (SAM) file in R using package Rsubread and subsequently in python. The analysis of the simulated sequencing data revealed that Kallisto could detect more transcripts and had a higher overlap over Bowtie2. The evaluation of these two data processing methods using the known lung cancer biomarkers concludes that in standard settings without any dedicated quality control, Kallisto is more effective at producing faster and more accurate results than Bowtie2. Such conclusions were also drawn and confirmed with the known biomarkers specific to liver cancer. |
---|---|
ISSN: | 1017-7825 1738-8872 |
DOI: | 10.4014/jmb.2110.10017 |