Systematic Analysis of Biomolecular Conformational Ensembles with PENSA

Bibliographic Details
Main Authors	Vögele, Martin; Thomson, Neil J; Truong, Sang T; McAvity, Jasper; Zachariae, Ulrich; Dror, Ron O
Format	Journal Article
Language	English
Published	5 December 2022
Subjects	Physics - Biological Physics; Physics - Computational Physics; Quantitative Biology - Biomolecules

More Information
Summary:	Atomic-level simulations are widely used to study biomolecules and their dynamics. A common goal in such studies is to compare simulations of a molecular system under several conditions (for example, with various mutations or bound ligands) in order to identify differences between the molecular conformations adopted under these conditions. However, the large amount of data produced by simulations of ever larger and more complex systems often makes it difficult to identify the structural features that are relevant for a particular biochemical phenomenon. We present a flexible software package named PENSA that enables a comprehensive investigation of biomolecular conformational ensembles. It provides featurizations and feature transformations that allow for a complete representation of biomolecules like proteins and nucleic acids, including water and ion binding sites, thus avoiding the bias that would come with manual feature selection. PENSA implements methods to systematically compare the distributions of molecular features across ensembles, to find the significant differences between them, and to identify regions of interest. It also includes a novel approach to quantify the state-specific information between two regions of a biomolecule, which allows, e.g., tracing information flow to identify allosteric pathways. PENSA also comes with convenient tools for loading data and visualizing results, making the data quick to process and the results easy to interpret. PENSA is an open-source Python library maintained at https://github.com/drorlab/pensa along with an example workflow and a tutorial. We demonstrate its usefulness in real-world examples by showing how it helps to determine molecular mechanisms efficiently.
DOI:	10.48550/arxiv.2212.02714
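
The summary describes comparing the distributions of molecular features across ensembles to find where they differ. The sketch below illustrates that general idea in plain Python and does not use the PENSA API. The choice of the Jensen-Shannon distance as the comparison metric, the equal-width binning, the function names, and the feature names (`phi_42`, `psi_42`) are all assumptions made for illustration; the summary does not name a metric. The synthetic samples stand in for per-frame feature values from two simulation conditions.

```python
import math
import random


def histogram(values, lo, hi, n_bins):
    """Return normalized bin probabilities over n_bins equal-width bins on [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        # Clamp the index so that v == hi falls into the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; bins with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_distance(sample_a, sample_b, n_bins=36):
    """Jensen-Shannon distance (square root of the divergence, base 2) in [0, 1]."""
    lo = min(min(sample_a), min(sample_b))
    hi = max(max(sample_a), max(sample_b))
    if hi == lo:
        return 0.0  # All values identical: the distributions cannot differ.
    p = histogram(sample_a, lo, hi, n_bins)
    q = histogram(sample_b, lo, hi, n_bins)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # Mixture distribution.
    jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(jsd, 0.0))


def rank_features(ensemble_a, ensemble_b):
    """Sort feature names by decreasing Jensen-Shannon distance between ensembles."""
    scores = {name: js_distance(ensemble_a[name], ensemble_b[name])
              for name in ensemble_a}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Synthetic stand-ins for per-frame torsion angles (degrees) under two conditions.
rng = random.Random(0)
ensemble_a = {
    "phi_42": [rng.gauss(-60.0, 10.0) for _ in range(2000)],  # hypothetical feature
    "psi_42": [rng.gauss(140.0, 15.0) for _ in range(2000)],  # hypothetical feature
}
ensemble_b = {
    "phi_42": [rng.gauss(60.0, 10.0) for _ in range(2000)],   # shifted distribution
    "psi_42": [rng.gauss(140.0, 15.0) for _ in range(2000)],  # same distribution
}

for name, score in rank_features(ensemble_a, ensemble_b):
    print(f"{name}: JS distance = {score:.3f}")
```

Features whose distributions shift between conditions score close to 1, while features that are distributed alike score close to 0, so the ranking points to candidate regions of interest. This sketch treats angles as ordinary numbers and ignores their periodicity, which a real analysis of torsions would need to handle.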