Major inconsistencies of inferred population genetic structure estimated in a large set of domestic horse breeds using microsatellites

Bibliographic Details
Published in	Ecology and evolution Vol. 10; no. 10; pp. 4261 - 4279
Main Authors	Funk, Stephan Michael; Guedaoura, Sonya; Juras, Rytis; Raziq, Absul; Landolsi, Faouzi; Luís, Cristina; Martínez, Amparo Martínez; Musa Mayaki, Abubakar; Mujica, Fernando; Oom, Maria do Mar; Ouragh, Lahoussine; Stranger, Yves‐Marie; Vega‐Pla, Jose Luis; Cothran, Ernest Gus
Format	Journal Article
Language	English
Published	England: John Wiley & Sons, Inc, 01.05.2020
Subjects	Clustering; Datasets; domestic horse; Domestication; Genetic markers; Genetic structure; Horses; Microsatellites; Original Research; Phylogenetics; Phylogeny; Population; population genetic structure; Population genetics; Population structure; Przewalski horse; Software; STRUCTURE analysis; Europe; Africa
Summary:	STRUCTURE remains the most widely applied software for recovering the true, but unknown, population structure from microsatellite or other genetic markers. About 30% of STRUCTURE‐based studies could not be reproduced (Molecular Ecology, 21, 2012, 4925). Here we use a large data set of 2,323 horses from 93 domestic breeds plus the Przewalski horse, typed at 15 microsatellites, to evaluate how the settings of the program STRUCTURE, the gold standard for this type of analysis, affect the estimation of the optimal number of population clusters, Kopt, that best describes the observed data. Domestic horses are well suited as a test case because there is extensive background knowledge on the history of many breeds, as well as extensive phylogenetic analyses. Different methods based on different genetic assumptions and statistical procedures (DAPC, FLOCK, PCoA, and STRUCTURE with different run scenarios) all revealed general, broad‐scale breed relationships that largely reflect known breed histories, but they diverged in how they characterized small‐scale patterns. STRUCTURE failed to consistently identify Kopt using the most widespread approach, the ΔK method, despite very large numbers of MCMC iterations (3,000,000) and replicates (100). The interpretation of breed structure over increasing numbers of K, without assuming a Kopt, was consistent with known breed histories. The over‐reliance on Kopt should be replaced by a qualitative description of clustering over increasing K, which is scientifically more honest and has the advantage of being much faster and less computer intensive, as lower numbers of MCMC iterations and repetitions suffice for stable results. Very large data sets are highly challenging for cluster analyses, especially when populations with complex genetic histories are investigated.
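For readers unfamiliar with the ΔK method named in the summary, the following is a minimal sketch of how ΔK is commonly computed from replicate STRUCTURE runs. It is an illustration only, not the authors' pipeline: the function name `delta_k`, the input layout, and the example log-likelihood values are all hypothetical, and the sketch assumes the common simplification of taking the second-order difference of the mean ln P(D|K) across replicates, divided by the standard deviation at K.

```python
from statistics import mean, stdev


def delta_k(log_likelihoods):
    """Return {K: ΔK} for every K that has both neighbours K-1 and K+1.

    log_likelihoods maps each K to a list of ln P(D|K) values,
    one value per replicate run (hypothetical input layout).
    """
    means = {k: mean(v) for k, v in log_likelihoods.items()}
    sds = {k: stdev(v) for k, v in log_likelihoods.items()}
    result = {}
    for k in sorted(log_likelihoods):
        # ΔK is undefined at the smallest and largest K tested,
        # and when replicates at K show no variance.
        if k - 1 in means and k + 1 in means and sds[k] > 0:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_diff / sds[k]
    return result


# Made-up values for illustration only; not data from the study.
example_runs = {
    1: [-1000, -1002, -998],
    2: [-900, -901, -899],
    3: [-890, -892, -888],
    4: [-885, -889, -881],
}
example_delta_k = delta_k(example_runs)
candidate_k = max(example_delta_k, key=example_delta_k.get)
```

The K with the largest ΔK is conventionally reported as Kopt. Because ΔK needs both neighbours, it cannot be evaluated at the lowest or highest K tested, and the summary above reports that this criterion did not give a consistent Kopt for the horse data set.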
ISSN:	2045-7758
DOI:	10.1002/ece3.6195