A Parallel Tiled and Sparsified Four-Russians Algorithm for Nussinov's RNA Folding

Bibliographic Details
Published in: IEEE/ACM Transactions on Computational Biology and Bioinformatics, Vol. 20, no. 3, pp. 1795-1806
Main Authors: Tchendji, Vianney Kengne; Youmbi, Franklin Ingrid Kamga; Djamegni, Clementin Tayou; Zeutouo, Jerry Lacmou
Format: Journal Article
Language: English
Published: United States, The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.05.2023
Summary: Predicting the spatial structure of the ribonucleic acid (RNA) molecule is a much-valued research field because it enables extensive research on the molecule. In this regard, Nussinov and Jacobson published what is now the de facto solution for predicting the halfway secondary structure, which runs in cubic time for an n-nucleotide sequence. Our contribution builds on two of the several works conducted to improve this running time. The first is that of Frid and Gusfield, which combines a speedup named sparsification with an on-demand Four-Russians paradigm to achieve the fastest theoretical solution known to date. The second is that of Palkowski and Bielecki, which efficiently restructures the classical Nussinov loop nest with a novel tiling technique. This paper shows that applying the latter approach, alongside other loop transformations, to the Frid and Gusfield doubly sped-up loop nest yields performance improvements both sequentially and in parallel, owing to loop restructuring that promotes cache reuse and variable-grained parallelism. In an empirical evaluation, we obtained, relative to Palkowski's sequential baseline, speedups of up to 3.26x sequentially and up to 28.50x on 32 threads of a multicore processor with a 30,000-nucleotide sequence. Furthermore, the massively parallel environment of graphics cards raised this speedup factor to 44.37x.
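For orientation, the cubic-time baseline mentioned in the summary can be sketched as below. This is a minimal illustration of the classical Nussinov recurrence only. It is not the tiled, sparsified, Four-Russians variant the paper describes. The function name `nussinov`, the `min_loop` parameter, and the choice to allow G-U wobble pairs are assumptions made for the example.

```python
def nussinov(seq, min_loop=0):
    """Return the maximum number of nested base pairs in seq.

    Classical O(n^3) dynamic program. Assumptions for this sketch:
    Watson-Crick plus G-U wobble pairs are allowed, and a pair (i, j)
    requires j - i > min_loop.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # table[i][j] holds the best score for the subsequence seq[i..j].
    table = [[0] * n for _ in range(n)]
    for span in range(1, n):              # fill by increasing length
        for i in range(n - span):
            j = i + span
            # Case 1 and 2: leave i or j unpaired.
            best = max(table[i + 1][j], table[i][j - 1])
            # Case 3: pair i with j around the inner subsequence.
            if (seq[i], seq[j]) in pairs and span > min_loop:
                inner = table[i + 1][j - 1] if i + 1 <= j - 1 else 0
                best = max(best, inner + 1)
            # Case 4: split into two independent substructures.
            # This inner loop over k is what makes the method cubic.
            for k in range(i + 1, j):
                best = max(best, table[i][k] + table[k + 1][j])
            table[i][j] = best
    return table[0][n - 1]
```

For example, `nussinov("GCGC")` evaluates to 2, since the sequence can form two G-C pairs. The innermost loop over `k` is the part that sparsification, the Four-Russians technique, and tiling each aim to speed up or restructure.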
ISSN: 1545-5963, 1557-9964
DOI: 10.1109/TCBB.2022.3216826