Method and system for generating amino acid sequences of proteins with specific functions and attributes using deep learning

Bibliographic Details
Main Authors QUAN XUEPING, YAN LINGHUA
Format Patent
Language Chinese, English
Published 12.07.2024
Summary: The invention discloses a method and a system for generating amino acid sequences of proteins with specific functions and attributes using deep learning. The method comprises the following steps: S1, collecting protein amino acid sequences from the databases UniParc, UniProtKB and Pfam; S2, using the sequences obtained in step S1 and the open-source machine learning library PyTorch to construct a language model based on the Transformer architecture, and pre-training the language model; S3, collecting the amino acid sequences of all proteins and the attributes of the different types of proteins from the database ProThermDB; S4, forming control tags from the protein attributes, and taking the control tags together with the amino acid sequences from step S3 as a fine-tuning data set; S5, setting hyper-parameters for fine-tuning the model, and obtaining a fine-tuned protein language model; and S6, inputting the control tag from step S4 into the protein language mo
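Step S4 (pairing control tags with sequences to form a fine-tuning data set) can be illustrated with a minimal Python sketch. The abstract does not specify the tag format, the vocabulary, or the attribute names, so everything below is an assumption made for illustration: the `<name=value>` tag syntax, the `thermostability` attribute, the special tokens, and the helper names `make_control_tag`, `build_vocab` and `encode` are all hypothetical.

```python
from dataclasses import dataclass

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


@dataclass
class Record:
    sequence: str     # amino acid sequence, one-letter codes
    attributes: dict  # hypothetical attribute name -> value


def make_control_tag(attributes):
    """Join attributes into one tag token (assumed format, not from the patent)."""
    parts = [f"{name}={attributes[name]}" for name in sorted(attributes)]
    return "<" + "|".join(parts) + ">"


def build_vocab(tags):
    """Map special tokens, control tags and residues to integer ids."""
    specials = ["<pad>", "<bos>", "<eos>"]
    tokens = specials + sorted(set(tags)) + list(AMINO_ACIDS)
    return {token: index for index, token in enumerate(tokens)}


def encode(record, vocab):
    """Encode one example as: <bos>, control tag, residues..., <eos>."""
    tag = make_control_tag(record.attributes)
    ids = [vocab["<bos>"], vocab[tag]]
    ids += [vocab[residue] for residue in record.sequence]
    ids.append(vocab["<eos>"])
    return ids


# Toy records standing in for entries collected in step S3.
records = [
    Record("MKTAYIAK", {"thermostability": "high"}),
    Record("MSLLNG", {"thermostability": "low"}),
]
tags = [make_control_tag(r.attributes) for r in records]
vocab = build_vocab(tags)
dataset = [encode(r, vocab) for r in records]
```

Placing the tag before the sequence means that, at generation time, a causal language model could be prompted with only `<bos>` and a control tag and asked to continue with residues, which matches the role the abstract gives the control tag in step S6. Non-standard residue codes (for example `X`) are not in this toy vocabulary and would raise a `KeyError`.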
Bibliography: Application Number: CN202410506901