A Bayesian approach for analysis of ordered categorical responses subject to misclassification

Bibliographic Details
Published in	PloS one Vol. 13; no. 12; p. e0208433
Main Authors	Ling, Ashley; Hay, El Hamidi; Aggrey, Samuel E.; Rekaya, Romdhane
Format	Journal Article
Language	English
Published	United States: Public Library of Science (PLoS), 13.12.2018
Subjects	Accuracy Analysis Animals asymmetry Bayes Theorem Bayesian analysis Bayesian theory Beef Beef cattle Bias Binary data Bioinformatics Biology and Life Sciences Body Weight - physiology Breeding Breeding - methods Breeding - statistics & numerical data Categories Cattle Cattle - classification Cattle - genetics Computer simulation Data collection Data processing Datasets Datasets as Topic - classification Datasets as Topic - statistics & numerical data Decision making Errors (Mistakes) Female Genetic Association Studies - statistics & numerical data Genetic Association Studies - veterinary Heritability Markov Chains Meat - statistics & numerical data Medical diagnosis Medicinal plants medicine Medicine and Health Sciences Methods Models, Statistical Parameter estimation Parturition - physiology Phenotype Physical Fitness Physical Sciences plant improvement Power efficiency Pregnancy probability Quantitative Trait, Heritable Reduction Research and Analysis Methods Statistical analysis Statistical inference surveys Variables United States > US Georgia

More Information
Summary:	Ordinal categorical responses are frequently collected in survey studies, human medicine, and animal and plant improvement programs, to name a few. Errors in this type of data are neither rare nor easy to detect. These errors tend to bias inference, reduce statistical power, and ultimately reduce the efficiency of the decision-making process. In contrast to the binary case, where misclassification occurs between two response classes, noise in ordinal categorical data is more complex because of the larger number of categories and the diversity and asymmetry of errors. Although several approaches have been presented for dealing with misclassification in binary data, few practical methods have been proposed for analyzing noisy categorical responses. A latent variable model implemented within a Bayesian framework is proposed for analyzing ordinal categorical data subject to misclassification, and it is evaluated using simulated and real datasets. The simulated scenario consisted of a discrete response with three categories and a symmetric error rate of 5% between any two classes. The real data consisted of calving ease records of beef cows. For both real and simulated data, ignoring misclassification resulted in substantial bias in the estimation of genetic parameters and reduced the accuracy of predicted breeding values. With the proposed approach, a significant reduction in bias and an increase in accuracy ranging from 11% to 17% were observed. Furthermore, most of the misclassified observations (in the simulated data) were identified with a substantially higher probability. Similar results were observed for a scenario with asymmetric misclassification. While the extension to traits with more categories between adjacent classes is straightforward, it could be computationally costly. For traits with high heritability, the performance of the methodology would be expected to improve.
Bibliography:	Competing Interests: The authors have declared that no competing interests exist.
ISSN:	1932-6203
DOI:	10.1371/journal.pone.0208433