Transgenomic capture and sequencing of primate exomes reveals new targets of positive selection
Published in | Genome Research Vol. 21; no. 10; pp. 1686-1694 |
---|---|
Main Authors | , , , , , , , |
Format | Journal Article |
Language | English |
Published | United States; Cold Spring Harbor Laboratory Press; 01.10.2011 |
Summary: | Comparison of protein-coding DNA sequences from diverse primates can provide insight into these species' evolutionary history and uncover the molecular basis for their phenotypic differences. Currently, the number of available primate reference genomes limits these genome-wide comparisons. Here we use targeted capture methods designed for human to sequence the protein-coding regions, or exomes, of four non-human primate species (three Old World monkeys and one New World monkey). Despite average sequence divergence of up to 4% from the human sequence probes, we are able to capture ~96% of coding sequences. Using a combination of mapping and assembly techniques, we generated high-quality full-length coding sequences for each species. Both the number of nucleotide differences and the distribution of insertion and deletion (indel) lengths indicate that the quality of the assembled sequences is very high and exceeds that of most reference genomes. Using this expanded set of primate coding sequences, we performed a genome-wide scan for genes experiencing positive selection and identified a novel class of adaptively evolving genes involved in the conversion of epithelial cells in skin, hair, and nails to keratin. Interestingly, the genes we identify under positive selection also exhibit significantly increased allele frequency differences among human populations, suggesting that they play a role in both recent and long-term adaptation. We also identify several genes that have been lost on specific primate lineages, which illustrate the broad utility of this data set for other evolutionary analyses. These results demonstrate the power of second-generation sequencing in comparative genomics and greatly expand the repertoire of available primate coding sequences. |
---|---|
ISSN: | 1088-9051, 1549-5469 |
DOI: | 10.1101/gr.121327.111 |