Identification of taxon through classification with partial reject options


Bibliographic Details
Main Authors: Karlsson, Måns; Hössjer, Ola
Format: Journal Article
Language: English
Published: 11.06.2019

Summary: Identification of taxa can be significantly assisted by statistical classification based on trait measurements in two major ways: either individually or by phylogenetic (clustering) methods. In this paper we present a general Bayesian approach for classifying species individually based on measurements of a mixture of continuous and ordinal traits as well as any type of covariates. It is assumed that the trait vector is derived from a latent variable with a multivariate Gaussian distribution. Decision rules based on supervised learning are presented that estimate model parameters through blockwise Gibbs sampling. The resulting decision regions allow for uncertainty (partial rejection), so that classifying a new subject need not output one specific category (taxon) but may instead output a set of categories that includes the most probable taxa. This type of discriminant analysis employs reward functions with a set-valued input argument, so that an optimal Bayes classifier can be defined. We also present a way of safeguarding against outlying new observations, using an analogue of a $p$-value within our Bayesian setting. Our method is illustrated on an original ornithological data set. We also incorporate model selection through cross-validation, exemplified on another original data set of birds.
DOI: 10.48550/arxiv.1906.04538
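
The summary describes classification with partial rejection: the classifier may return a set of taxa rather than a single one, chosen to maximise a reward defined on sets. The record does not state the paper's reward function, so the following Python sketch assumes a simple illustrative one: a reward of 1 if the true taxon lies in the returned set, minus a penalty `c` for each extra category beyond the first. The function name `set_valued_classify`, the parameter `c`, and the example posterior probabilities are hypothetical, and the posterior is taken as given (for instance, estimated from Gibbs samples) rather than computed here.

```python
import numpy as np


def set_valued_classify(posterior, c):
    """Return the non-empty set S of category indices that maximises the
    expected value of the ASSUMED reward 1{true category in S} - c * (|S| - 1).

    posterior : posterior class probabilities for one new observation
    c         : penalty per extra category (an illustrative assumption)
    """
    p = np.asarray(posterior, dtype=float)
    if p.ndim != 1 or p.size == 0:
        raise ValueError("posterior must be a non-empty 1-D sequence")
    if np.any(p < 0) or not np.isclose(p.sum(), 1.0):
        raise ValueError("posterior must be non-negative and sum to 1")
    if c < 0:
        raise ValueError("c must be non-negative")

    # Adding category i to a non-empty set changes the expected reward
    # by p[i] - c, so every category with p[i] > c is worth including.
    keep = p > c
    # The set must be non-empty: always keep the most probable category.
    keep[np.argmax(p)] = True
    return sorted(np.flatnonzero(keep).tolist())


# Two taxa are nearly tied, so both are returned (partial rejection).
ambiguous = set_valued_classify([0.48, 0.45, 0.07], c=0.2)
# One taxon clearly dominates, so a single category is returned.
clear_cut = set_valued_classify([0.90, 0.06, 0.04], c=0.2)
```

Under this assumed reward, a larger `c` pushes the classifier toward single-taxon answers, while `c = 0` returns every category with positive posterior probability.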