GSEApy: a comprehensive package for performing gene set enrichment analysis in Python

Bibliographic Details
Published in Bioinformatics (Oxford, England) Vol. 39; no. 1
Main Authors Fang, Zhuoqing, Liu, Xinyuan, Peltz, Gary
Format Journal Article
Language English
Published England Oxford University Press 01.01.2023
Oxford Publishing Limited (England)
Summary:

Motivation: Gene set enrichment analysis (GSEA) is a commonly used algorithm for characterizing gene expression changes. However, the currently available tools used to perform GSEA have a limited ability to analyze large datasets, which is particularly problematic for the analysis of single-cell data. To overcome this limitation, we developed a GSEA package in Python (GSEApy), which could efficiently analyze large single-cell datasets.

Results: We present a package (GSEApy) that performs GSEA in either the command line or Python environment. GSEApy uses a Rust implementation to enable it to calculate the same enrichment statistic as GSEA for a collection of pathways. The Rust implementation of GSEApy is 3-fold faster than the Numpy version of GSEApy (v0.10.8) and uses >4-fold less memory. GSEApy also provides an interface between Python and Enrichr web services, as well as for BioMart. The Enrichr application programming interface enables GSEApy to perform over-representation analysis for an input gene list. Furthermore, GSEApy consists of several tools, each designed to facilitate a particular type of enrichment analysis.

Availability and implementation: The new GSEApy with Rust extension is deposited in PyPI: https://pypi.org/project/gseapy/. The GSEApy source code is freely available at https://github.com/zqfang/GSEApy. Also, the documentation website is available at https://gseapy.rtfd.io/.

Supplementary information: Supplementary data are available at Bioinformatics online.
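
The enrichment statistic mentioned in the summary is, in the standard GSEA formulation, a weighted running-sum score computed over a ranked gene list. The block below is a minimal pure-Python sketch of that idea and is not GSEApy's Rust or Numpy code. The function name `enrichment_score`, its parameters, and the toy gene names are assumptions made for illustration only.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Sketch of a GSEA-style running-sum enrichment score.

    ranked_genes: gene names, already sorted by descending ranking metric
    scores:       ranking metric for each gene, in the same order
    gene_set:     collection of gene names to test
    weight:       exponent applied to |score| for genes in the set
    (All names here are illustrative, not part of any library API.)
    """
    members = set(gene_set)
    hits = [gene in members for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")

    # Each hit contributes |score|^weight, normalized so all hits sum to 1.
    hit_weights = [abs(s) ** weight if h else 0.0 for s, h in zip(scores, hits)]
    norm = sum(hit_weights)
    if norm == 0:
        raise ValueError("all genes in the set have a zero ranking metric")

    running = 0.0
    best = 0.0
    for w, is_hit in zip(hit_weights, hits):
        # Step up on a hit, step down by a constant on a miss.
        running += w / norm if is_hit else -1.0 / n_miss
        # Keep the running value with the largest deviation from zero.
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranked list (hypothetical gene names and scores).
genes = ["A", "B", "C", "D", "E", "F"]
metric = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
es_top = enrichment_score(genes, metric, {"A", "B"})
es_bottom = enrichment_score(genes, metric, {"E", "F"})
print(es_top, es_bottom)
```

A set concentrated at the top of the ranking gives a positive score and a set concentrated at the bottom gives a negative one. Significance testing (permutations, normalization, multiple-testing correction) is left out of this sketch.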
ISSN: 1367-4803, 1367-4811
DOI: 10.1093/bioinformatics/btac757