Method and system for generating amino acid sequence of protein with specific function and attribute by using deep learning technology
Main Authors | , |
---|---|
Format | Patent |
Language | Chinese, English |
Published | 12.07.2024 |
Subjects | |
Summary: | The invention discloses a method and a system for generating amino acid sequences of proteins with specific functions and attributes by using deep learning. The method comprises the following steps: S1, collecting protein amino acid sequences from the databases UniParc, UniProtKB and Pfam; S2, using the amino acid sequences obtained in step S1 and the open-source machine learning library PyTorch to construct a language model based on the Transformer architecture, and pre-training the language model; S3, collecting the amino acid sequences of all proteins and the attributes of different types of proteins from the database ProThermDB; S4, forming control tags from the protein attributes, and taking the control tags together with the amino acid sequences from step S3 as a fine-tuning data set; S5, setting hyperparameters for fine-tuning the model, and obtaining a fine-tuned protein language model; and S6, inputting the control tag from step S4 into the protein language mo |
---|---|
Bibliography: | Application Number: CN202410506901 |
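
Steps S3 and S4 of the summary describe pairing each protein's attributes with its amino acid sequence to form a fine-tuning data set. The following is a minimal Python sketch of one way such control-tagged examples could be encoded, not the patent's actual implementation. The tag format (`<name=value>`), the attribute name `stability`, its values, the example sequences, and all function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


@dataclass
class ProteinRecord:
    sequence: str
    attributes: dict  # hypothetical attribute name -> value, e.g. {"stability": "high"}


def control_tags(attributes):
    """Turn each attribute into one tag token; sorted so the order is deterministic."""
    return [f"<{name}={attributes[name]}>" for name in sorted(attributes)]


def build_vocab(records):
    """Vocabulary: special tokens, the 20 residues, then every tag seen in the data."""
    tokens = ["<bos>", "<eos>"] + list(AMINO_ACIDS)
    tags = sorted({tag for r in records for tag in control_tags(r.attributes)})
    return {token: index for index, token in enumerate(tokens + tags)}


def encode_example(record, vocab):
    """Encode one example as <bos>, control tags, residues, <eos>."""
    unknown = set(record.sequence) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    tokens = (
        ["<bos>"]
        + control_tags(record.attributes)
        + list(record.sequence)
        + ["<eos>"]
    )
    return [vocab[token] for token in tokens]


# Hypothetical records standing in for entries collected in step S3.
records = [
    ProteinRecord("MKTAYIAKQR", {"stability": "high"}),
    ProteinRecord("GSHMLEDPVA", {"stability": "low"}),
]
vocab = build_vocab(records)
dataset = [encode_example(record, vocab) for record in records]
```

Placing the tags before the residues means an autoregressive model sees the desired attributes first, so at generation time the same tags can serve as the prompt from which a sequence is sampled.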