PSSP-MFFNet: A Multifeature Fusion Network for Protein Secondary Structure Prediction

Bibliographic Details
Published in ACS Omega Vol. 9; no. 5; pp. 5985-5994
Main Authors Chen, Yifu, Chen, Guanxing, Chen, Calvin Yu-Chian
Format Journal Article
Language English
Published United States American Chemical Society 06.02.2024
Summary: Protein secondary structure prediction (PSSP) is a fundamental task in modern bioinformatics research and is particularly important for uncovering the functional mechanisms of proteins. To improve the accuracy of PSSP, various general and essential features generated from amino acid sequences are often used for predicting the secondary structure. In this paper, we propose PSSP-MFFNet, a deep learning-based multi-feature fusion network for PSSP, which incorporates a multi-view deep learning architecture with the multiple sequence alignment (MSA) Transformer to efficiently capture global and local features of protein sequences. In practice, PSSP-MFFNet adopts a multi-feature fusion strategy, integrating different features generated from protein sequences, including MSA, sequence information, evolutionary information, and hidden state information. Moreover, we employ the MSA Transformer to interleave row and column attention across the input MSA. A hybrid network architecture of convolutional neural networks and long short-term memory networks is applied to extract high-level features after feature fusion. Furthermore, we introduce a transformer encoder to enhance the extracted high-level features. Comparative experimental results on independent tests demonstrate that PSSP-MFFNet has excellent generalization ability, outperforming other state-of-the-art PSSP models by an average of 1% on public benchmarks, including CASP12, CASP13, CASP14, TEST2018, and CB513. Our method can contribute to a better understanding of the biological functions of proteins, which has significant implications for drug discovery, disease diagnosis, and protein engineering.
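Illustrative note: the summary mentions interleaving row and column attention across the input MSA. The sketch below shows only the general idea, using plain scaled dot-product attention on a toy tensor of shape (sequences, positions, embedding size). It is not the authors' code or the MSA Transformer implementation: learned query/key/value projections, multiple heads, residual connections, and normalization are all omitted, and every name in it (`self_attention`, `row_attention`, `column_attention`, `axial_block`, `toy_msa`) is hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def self_attention(vectors):
    """Scaled dot-product self-attention without learned weights.

    Each output vector is a weighted average of all input vectors,
    with weights given by softmax(q . k / sqrt(d)).
    """
    dim = len(vectors[0])
    scale = math.sqrt(dim)
    mixed = []
    for query in vectors:
        weights = softmax([dot(query, key) / scale for key in vectors])
        mixed.append(
            [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
        )
    return mixed


def row_attention(msa):
    """Attend across residue positions within each aligned sequence."""
    return [self_attention(row) for row in msa]


def column_attention(msa):
    """Attend across sequences at each alignment column."""
    n_rows, n_cols = len(msa), len(msa[0])
    columns = [
        self_attention([msa[r][c] for r in range(n_rows)]) for c in range(n_cols)
    ]
    # Transpose back to (sequences, positions, embedding size).
    return [[columns[c][r] for c in range(n_cols)] for r in range(n_rows)]


def axial_block(msa):
    """One interleaved step: row attention followed by column attention."""
    return column_attention(row_attention(msa))


# Toy input: 3 aligned sequences, 4 positions, 2-dimensional embeddings.
toy_msa = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]],
    [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5], [1.0, 1.0]],
    [[1.0, 1.0], [0.5, 0.5], [0.0, 1.0], [1.0, 0.0]],
]
mixed_msa = axial_block(toy_msa)
```

The output keeps the input shape, so blocks of this kind can be stacked. Splitting attention into a row pass and a column pass avoids attending over all sequence-position pairs at once.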
ISSN: 2470-1343
DOI: 10.1021/acsomega.3c10230