Integrative Comparison of Burrows-Wheeler Transform-Based Mapping Algorithm with de Bruijn Graph for Identification of Lung/Liver Cancer-Specific Gene

Cancers of the lung and liver are the top 10 leading causes of cancer death worldwide. Thus, it is essential to identify the genes specifically expressed in these two cancer types to develop new therapeutics. Although many messenger RNA (mRNA) sequencing data related to these cancer cells are availa...

Full description

Saved in:
Bibliographic Details
Published inJournal of microbiology and biotechnology Vol. 32; no. 2; pp. 149 - 159
Main Authors Ajaykumar, Atul, Yang, Jung Jin
Format Journal Article
LanguageEnglish
Published Korea (South) The Korean Society for Microbiology and Biotechnology 28.02.2022
한국미생물·생명공학회
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Cancers of the lung and liver are the top 10 leading causes of cancer death worldwide. Thus, it is essential to identify the genes specifically expressed in these two cancer types to develop new therapeutics. Although many messenger RNA (mRNA) sequencing data related to these cancer cells are available due to the advancement of next-generation sequencing (NGS) technologies, optimized data processing methods need to be developed to identify the novel cancer-specific genes. Here, we conducted an analytical comparison between Bowtie2, a Burrows-Wheeler transform-based alignment tool, and Kallisto, which adopts pseudo alignment based on a transcriptome de Bruijn graph using mRNA sequencing data on normal cells and lung/liver cancer tissues. Before using cancer data, simulated mRNA sequencing reads were generated, and the high Transcripts Per Million (TPM) values were compared. mRNA sequencing reads data on lung/liver cancer cells were also extracted and quantified. While Kallisto could directly give the output in TPM values, Bowtie2 provided the counts. Thus, TPM values were calculated by processing the Sequence Alignment Map (SAM) file in R using package Rsubread and subsequently in python. The analysis of the simulated sequencing data revealed that Kallisto could detect more transcripts and had a higher overlap over Bowtie2. The evaluation of these two data processing methods using the known lung cancer biomarkers concludes that in standard settings without any dedicated quality control, Kallisto is more effective at producing faster and more accurate results than Bowtie2. Such conclusions were also drawn and confirmed with the known biomarkers specific to liver cancer.
ISSN:1017-7825
1738-8872
DOI:10.4014/jmb.2110.10017