Tracing the De Novo Origin of Protein-Coding Genes in Yeast


Bibliographic Details
Published in: mBio Vol. 9; no. 4
Main Authors: Wu, Baojun; Knudson, Alicia
Format: Journal Article
Language: English
Published: United States, American Society for Microbiology, 31.07.2018
Summary: De novo genes are very important for evolutionary innovation. However, how these genes originate and spread remains largely unknown. To better understand this, we rigorously searched for de novo genes in Saccharomyces cerevisiae S288C and examined their spread and fixation in the population. We identified 84 de novo genes that have arisen in S288C since its divergence from its sister groups. Transcriptome and ribosome profiling data revealed that at least 8 (10%) and 28 (33%) of these genes are expressed and translated, respectively, only under specific conditions. DNA microarray data, based on a 2-fold change threshold, showed that 87% of the de novo genes are regulated during various biological processes, such as nutrient utilization and sporulation. Our comparative and evolutionary analyses further revealed that several factors, including single nucleotide polymorphism (SNP)/indel mutation, high GC content, and DNA shuffling, contribute to the birth of de novo genes, while domestication and natural selection drive the spread and fixation of these genes. Finally, we also provide evidence suggesting the possible parallel origin of a de novo gene in two lineages. Together, our study provides several new insights into the origin and spread of de novo genes.

Emergence of de novo genes has occurred in many lineages during evolution, but the birth, spread, and function of these genes remain unresolved. Here we searched for de novo genes in S. cerevisiae S288C using rigorous methods that reduce the effects of poor annotation and genomic gaps on the identification of de novo genes. Through this analysis, we found 84 new genes originating from previously noncoding regions, 87% of which are very likely involved in various biological processes. We noticed that 10% and 33% of the de novo genes were expressed and translated, respectively, only under specific conditions; therefore, verification of de novo genes through transcriptome and ribosome profiling, especially from limited expression data, may underestimate the number of bona fide new genes. We further show that SNP/indel mutation, high GC content, and DNA shuffling could be involved in the birth of de novo genes, while domestication and natural selection drive the spread and fixation of these genes. Finally, we provide evidence suggesting the possible parallel origin of a new gene.
ISSN: 2161-2129, 2150-7511
DOI: 10.1128/mBio.01024-18