Comparison of algorithms to infer genetic population structure from unlinked molecular markers
Published in | Statistical applications in genetics and molecular biology Vol. 13; no. 4; pp. 391 - 402 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | Germany, De Gruyter, 01.08.2014 |
ISSN | 2194-6302 1544-6115 |
DOI | 10.1515/sagmb-2013-0006 |
Summary: | Identifying population genetic structure (PGS) is crucial for breeding and conservation. Several clustering algorithms are available to identify the underlying PGS from genetic data of maize genotypes. In this work, six methods to identify PGS from unlinked molecular marker data were compared using simulated and experimental data consisting of multilocus biallelic genotypes. Datasets were delineated under different biological scenarios characterized by three levels of genetic divergence among populations (low, medium, and high F_ST) and two numbers of sub-populations (K=3 and K=5). The relative performance of hierarchical and non-hierarchical clustering, model-based clustering (STRUCTURE), and clustering from neural networks (SOM-RP-Q) was evaluated, using the clustering error rate of genotypes into discrete sub-populations as the comparison criterion. In scenarios with a high level of divergence among genotype groups, all methods performed well. With a moderate level of genetic divergence (F_ST=0.2), the algorithms SOM-RP-Q and STRUCTURE performed better than hierarchical and non-hierarchical clustering. In all simulated scenarios with low genetic divergence and in the experimental SNP maize panel (largely unlinked), SOM-RP-Q achieved the lowest clustering error rate. The SOM algorithm used here is more effective than the other evaluated methods for sparse unlinked genetic data. |
---|---|
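
The summary names the clustering error rate of genotypes into discrete sub-populations as the comparison criterion, but this record does not give its exact definition. The sketch below shows one common way such a rate could be computed: cluster labels are arbitrary, so the predicted clusters are first matched one-to-one to the true sub-populations in the way that maximizes agreement, and the remaining mismatches are counted as errors. The function name `clustering_error_rate` and the matching rule are assumptions made for illustration, not the authors' procedure. The exhaustive search over matchings is only practical for small numbers of clusters, such as the K=3 and K=5 scenarios mentioned above.

```python
from itertools import permutations


def clustering_error_rate(true_labels, predicted_labels):
    """Fraction of genotypes misassigned under the best one-to-one
    matching of predicted clusters to true sub-populations.

    Hypothetical helper for illustration; the matching rule is an assumption.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if len(true_labels) == 0:
        raise ValueError("label sequences must not be empty")

    # Distinct labels, in order of first appearance.
    true_ids = list(dict.fromkeys(true_labels))
    pred_ids = list(dict.fromkeys(predicted_labels))

    # Pad the targets with placeholders so that surplus predicted clusters
    # can stay unmatched (all of their members then count as errors).
    unmatched = object()
    size = max(len(true_ids), len(pred_ids))
    targets = true_ids + [unmatched] * (size - len(true_ids))

    best_correct = 0
    # Try every injective assignment of predicted clusters to targets.
    for assignment in permutations(targets, len(pred_ids)):
        mapping = dict(zip(pred_ids, assignment))
        correct = sum(
            1 for t, p in zip(true_labels, predicted_labels) if mapping[p] == t
        )
        best_correct = max(best_correct, correct)

    return 1.0 - best_correct / len(true_labels)


if __name__ == "__main__":
    # Toy data: nine genotypes from three sub-populations, one misassigned.
    truth = [0, 0, 0, 1, 1, 1, 2, 2, 2]
    predicted = [2, 2, 2, 0, 0, 1, 1, 1, 1]
    print(clustering_error_rate(truth, predicted))
```

For larger numbers of clusters, the same matching is usually solved with an assignment algorithm rather than by enumerating permutations.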