Clustering in non-parametric multivariate analyses

Bibliographic Details
Published in	Journal of experimental marine biology and ecology Vol. 483; pp. 147 - 155
Main Authors	Clarke, K. Robert; Somerfield, Paul J.; Gorley, Raymond N.
Format	Journal Article
Language	English
Published	Elsevier B.V. 01.10.2016
Subjects	Brackish; Cophenetic correlation; Cophenetic distance; Divisive clustering; Flat clustering; Marine; Non-parametric multivariate; SIMPROF; ANE, British Isles, Bristol Channel; British Isles; ANE, British Isles, England, Severn Estuary
Summary:	Non-parametric multivariate analyses of complex ecological datasets are widely used. Following appropriate pre-treatment of the data, inter-sample resemblances are calculated using appropriate measures. Ordination and clustering derived from these resemblances are used to visualise relationships among samples (or variables). Hierarchical agglomerative clustering with group-average (UPGMA) linkage is often the clustering method chosen. Using an example dataset of zooplankton densities from the Bristol Channel and Severn Estuary, UK, a range of existing and new clustering methods are applied and the results compared. Although the examples focus on analysis of samples, the methods may also be applied to species analysis. Dendrograms derived by hierarchical clustering are compared using cophenetic correlations, which are also used to determine optimum β in flexible beta clustering. A plot of cophenetic correlation against original dissimilarities reveals that a tree may be a poor representation of the full multivariate information. UNCTREE is an unconstrained binary divisive clustering algorithm in which values of the ANOSIM R statistic are used to determine (binary) splits in the data, to form a dendrogram. A form of flat clustering, k-R clustering, uses a combination of ANOSIM R and Similarity Profiles (SIMPROF) analyses to determine the optimum value of k, the number of groups into which samples should be clustered, and the sample membership of the groups. Robust outcomes from the application of such a range of differing techniques to the same resemblance matrix, as here, result in greater confidence in the validity of a clustering approach.
• Dendrograms may be poor representations of inter-sample dissimilarities.
• ANOSIM R and SIMPROF are combined to generate new methods of clustering.
• UNCTREE is a binary divisive clustering algorithm.
• k-R clustering is a flat clustering algorithm.
• Robustness of clustering is assessed by applying different methods to example data.
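
Two quantities named in the summary, the cophenetic correlation of a dendrogram and the ANOSIM R statistic, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the synthetic count data, the square-root pre-treatment and the Bray-Curtis measure are assumptions made for illustration. The cophenetic correlation is the Pearson correlation returned by SciPy's `cophenet`, and R is computed from the standard definition R = (r_B - r_W) / (n(n - 1)/4), where r_B and r_W are the mean ranks of between-group and within-group dissimilarities. Flexible beta, UNCTREE and k-R clustering are not covered.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, linkage
from scipy.spatial.distance import pdist
from scipy.stats import rankdata


def example_data(seed=0):
    """Synthetic counts (hypothetical, not the zooplankton data): 6 samples x 10 species."""
    rng = np.random.default_rng(seed)
    counts = rng.poisson(5.0, size=(6, 10)) + 1
    counts[:3, 5:] = 0  # group 0 holds only species 0-4
    counts[3:, :5] = 0  # group 1 holds only species 5-9
    labels = np.array([0, 0, 0, 1, 1, 1])
    return counts.astype(float), labels


def cophenetic_correlations(dissim, methods=("single", "complete", "average")):
    """Map each linkage method to the cophenetic correlation of its dendrogram.

    `dissim` is a condensed dissimilarity vector, as returned by pdist.
    """
    result = {}
    for method in methods:
        tree = linkage(dissim, method=method)
        corr, _ = cophenet(tree, dissim)  # Pearson correlation, as computed by SciPy
        result[method] = float(corr)
    return result


def anosim_r(dissim, labels):
    """ANOSIM R for one grouping factor, from a condensed dissimilarity vector."""
    labels = np.asarray(labels)
    n = labels.size
    ranks = rankdata(dissim)  # tied dissimilarities receive average ranks
    i, j = np.triu_indices(n, k=1)  # same pair order as the condensed vector
    within = labels[i] == labels[j]
    r_within = ranks[within].mean()
    r_between = ranks[~within].mean()
    return float((r_between - r_within) / (n * (n - 1) / 4))


if __name__ == "__main__":
    data, groups = example_data()
    # Assumed pre-treatment: square-root transform, then Bray-Curtis dissimilarity.
    d = pdist(np.sqrt(data), metric="braycurtis")
    for name, value in cophenetic_correlations(d).items():
        print(f"{name:>8}: cophenetic correlation = {value:.3f}")
    print(f"ANOSIM R = {anosim_r(d, groups):.3f}")
```

In this sketch "average" linkage corresponds to the group-average (UPGMA) method mentioned in the summary. Comparing the values across linkage methods mirrors, in a very reduced form, the comparison of dendrograms described there.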
ISSN:	0022-0981 1879-1697
DOI:	10.1016/j.jembe.2016.07.010