SpEpistasis: A sparse approach for three-way epistasis detection

Bibliographic Details
Published in Journal of parallel and distributed computing Vol. 195; p. 104989
Main Authors Marques, Diogo; Sousa, Leonel; Ilic, Aleksandar
Format Journal Article
Language English
Published Elsevier Inc 01.01.2025

Summary: Epistasis detection is a fundamental application in the areas of bioinformatics and biomedicine, providing important insights regarding the relationship between the human genome and the occurrence of certain diseases. Exhaustive epistasis detection approaches are employed to achieve an accurate and deterministic solution, at the cost of high computational complexity, especially when targeting high-order epistasis. While recent works employ vectorization and cache-blocking techniques to alleviate this burden, these solutions are now limited by the maximum performance of the functional units of computing systems. Thus, to further improve the performance of epistasis detection it is necessary to reduce its number of memory transfers and computations. To tackle this issue, this work proposes SpEpistasis, which performs three-way epistasis detection by relying on sparse features, which by only storing the non-zero elements of the dataset, allows for reducing the number of operations needed for epistasis detection. To achieve this goal, a new hybrid format to represent the input dataset is proposed, which stores a subset of the data in the compressed sparse row format. Moreover, new sparse-aware algorithmic approaches are also proposed in order to leverage both the hybrid format and the vector capabilities of current CPUs from Intel, AMD, and ARM. The experimental results show that SpEpistasis provides a speedup up to 3.7× and average speedups of around 1.8× and 1.33× when compared with other state-of-the-art works.
Highlights:
• Hybrid sparse-dense format for representing epistasis detection datasets, by relying on Compressed Sparse Row (CSR).
• Sparse three-way epistasis detection, with novel algorithmic approaches to exploit the proposed hybrid format.
• Vectorization strategy for AVX512, AVX, and ARM SVE, for an efficient execution across different microarchitectures.
ISSN:0743-7315
DOI:10.1016/j.jpdc.2024.104989
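
Illustration: the summary refers to the compressed sparse row (CSR) format, which stores only the non-zero elements of each row. The sketch below is a generic, minimal example of that idea and is not the hybrid format or the algorithm from the article. It assumes a hypothetical 0/1 indicator matrix (rows standing in for SNP/genotype indicator vectors, columns for samples), and the names `to_csr`, `row_cols`, and `count_common` are invented for this example. It counts, for each triple of rows, how many samples are non-zero in all three, touching only the stored non-zero positions.

```python
from itertools import combinations


def to_csr(rows):
    """Build CSR arrays (row_ptr, col_idx) for a 0/1 matrix given as a list of rows."""
    row_ptr, col_idx = [0], []
    for row in rows:
        # Keep only the column positions of the non-zero entries.
        col_idx.extend(j for j, v in enumerate(row) if v)
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx


def row_cols(row_ptr, col_idx, i):
    """Return the non-zero column indices stored for row i."""
    return col_idx[row_ptr[i]:row_ptr[i + 1]]


def count_common(row_ptr, col_idx, a, b, c):
    """Number of columns (samples) that are non-zero in rows a, b and c."""
    common = set(row_cols(row_ptr, col_idx, a))
    common &= set(row_cols(row_ptr, col_idx, b))
    common &= set(row_cols(row_ptr, col_idx, c))
    return len(common)


# Hypothetical toy data: 4 indicator rows x 8 samples (assumption, not from the article).
bits = [
    [1, 0, 0, 1, 0, 0, 1, 0],
    [1, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 1, 1, 0],
    [1, 1, 0, 1, 0, 0, 0, 0],
]

row_ptr, col_idx = to_csr(bits)

# Three-way co-occurrence count for every combination of three rows.
triple_counts = {
    t: count_common(row_ptr, col_idx, *t)
    for t in combinations(range(len(bits)), 3)
}
```

The work per triple in this sketch scales with the number of stored non-zeros rather than with the number of samples, which is the general motivation for a sparse representation; how the article combines sparse and dense storage and vectorizes the counting is not reproduced here.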