Tracing the De Novo Origin of Protein-Coding Genes in Yeast
Published in | mBio Vol. 9; no. 4 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | United States: American Society for Microbiology, 31.07.2018 |
Summary: | De novo genes are very important for evolutionary innovation. However, how these genes originate and spread remains largely unknown. To better understand this, we rigorously searched for de novo genes in Saccharomyces cerevisiae S288C and examined their spread and fixation in the population. Here, we identified 84 de novo genes that have arisen in S. cerevisiae S288C since its divergence from its sister groups. Transcriptome and ribosome profiling data revealed at least 8 (10%) and 28 (33%) de novo genes being expressed and translated only under specific conditions, respectively. DNA microarray data, based on 2-fold change, showed that 87% of the de novo genes are regulated during various biological processes, such as nutrient utilization and sporulation. Our comparative and evolutionary analyses further revealed that some factors, including single nucleotide polymorphism (SNP)/indel mutation, high GC content, and DNA shuffling, contribute to the birth of
de novo genes, while domestication and natural selection drive the spread and fixation of these genes. Finally, we also provide evidence suggesting the possible parallel origin of a de novo gene between S. cerevisiae and Saccharomyces paradoxus. Together, our study provides several new insights into the origin and spread of de novo genes.
Emergence of de novo genes has occurred in many lineages during evolution, but the birth, spread, and function of these genes remain unresolved. Here we have searched for de novo genes from Saccharomyces cerevisiae S288C using rigorous methods, which reduced the effects of poor annotation and genomic gaps on the identification of de novo genes. Through this analysis, we have found 84 new genes originating de novo from previously noncoding regions, 87% of which are very likely involved in various biological processes. We noticed that 10% and 33% of de novo genes were expressed and translated, respectively, only under specific conditions; therefore, verification of de novo genes through transcriptome and ribosome profiling, especially from limited expression data, may underestimate the number of bona fide new genes. We further show that SNP/indel mutation, high GC content, and DNA shuffling could be involved in the birth of de novo genes, while domestication and natural selection drive the spread and fixation of these genes. Finally, we provide evidence suggesting the possible parallel origin of a new gene. |
---|---|
ISSN: | 2161-2129 2150-7511 |
DOI: | 10.1128/mBio.01024-18 |