Identification of taxon through classification with partial reject options
Main Authors | , |
---|---|
Format | Journal Article |
Language | English |
Published | 11.06.2019 |
Subjects | |
Summary: | Identification of taxa can be significantly assisted by statistical
classification based on trait measurements in two major ways: either
individually or by phylogenetic (clustering) methods. In this paper we present
a general Bayesian approach for classifying species individually based on
measurements of a mixture of continuous and ordinal traits as well as any type
of covariates. It is assumed that the trait vector is derived from a latent
variable with a multivariate Gaussian distribution. Decision rules based on
supervised learning are presented that estimate model parameters through
blockwise Gibbs sampling. The resulting decision regions allow for uncertainty
(partial rejection): when a new subject is classified, the output is not
necessarily one specific category (taxon) but rather a set of categories that
includes the most probable taxa. This type of discriminant analysis employs reward functions
with a set-valued input argument, so that an optimal Bayes classifier can be
defined. We also present a way of safeguarding against outlying new
observations, using an analogue of a $p$-value within our Bayesian setting. Our
method is illustrated on an original ornithological data set. We also
incorporate model selection through cross-validation, exemplified on another
original data set of birds. |
---|---|
DOI: | 10.48550/arxiv.1906.04538 |
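
The summary describes a classifier that may output a set of taxa rather than a single one, chosen by maximising an expected reward. The following is a minimal sketch of that idea only, not the paper's method: the function name `set_valued_classifier`, the `cost` parameter, and the reward itself (1 if the true category is in the output set, minus `cost` for every category beyond the first) are assumptions made for illustration, and the posterior probabilities are made-up inputs rather than output from a fitted model.

```python
import numpy as np


def set_valued_classifier(posterior, cost):
    """Return the category indices that maximise an assumed expected reward.

    Assumed reward (illustrative, not taken from the paper): 1 if the true
    category lies in the output set, minus `cost` for each category in the
    set beyond the first.
    """
    posterior = np.asarray(posterior, dtype=float)
    # Expected reward of a non-empty set I is sum_{i in I} p_i - cost * (|I| - 1),
    # so adding category i helps exactly when p_i > cost.
    chosen = np.flatnonzero(posterior > cost)
    if chosen.size == 0:
        # Never output an empty set: fall back to the most probable category.
        chosen = np.array([int(np.argmax(posterior))])
    return sorted(int(i) for i in chosen)


# Two taxa are nearly equally probable, so both are reported.
print(set_valued_classifier([0.48, 0.45, 0.07], cost=0.2))
# One taxon dominates, so the output is a single category.
print(set_valued_classifier([0.90, 0.08, 0.02], cost=0.2))
```

Under this assumed reward, a larger `cost` shrinks the output set towards a single most probable taxon, while a smaller `cost` lets more candidate taxa through.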