Exploring Sentiment as a Potential Indicator of Bias in Disease Ontologies

Bibliographic Details
Published in	2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) pp. 1826 - 1834
Main Authors	Slater, Luke T., Williams, John A., Schofield, Paul N., Gkoutos, Georgios V.
Format	Conference Proceeding
Language	English
Published	IEEE 09.12.2021
Subjects	bias; Computers; Conferences; disease; Metadata; Natural languages; Ontologies; ontology; Sentiment analysis; Vocabulary
Summary:	Ontologies are fundamental tools for the organisation and analysis of biomedical data. One of their roles is as controlled domain vocabularies, providing standardised language and categorisation for relevant domain concepts. As such, ontologies frequently include a wealth of natural language metadata, including labels and definitions. Since these metadata are usually created by humans, conscious and unconscious biases may be reflected in them. Moreover, humans and computers engage directly with these metadata during the course of scientific practice, and therefore any biases or idiosyncrasies may influence work involving the use of these concepts. Previous work has exposed the possibility of bias in ontological representations of disease domains; however, no methods have been developed for automatic or semiautomatic guidance towards bias in ontology metadata. In this article, we develop an approach to explore sentiment analysis as a potential indicator of bias in ontology concept definitions. We evaluate its use on pairs of disease classes from MeSH and the Human Disease Ontology (DO), comparing and contrasting sentiment scores between them. We use these comparisons to identify and evaluate a number of outlying examples, relating them to existing literature. We discuss how our approach could be used to guide ontology developers towards outlying and potentially biased language, forming a tool that could be used to evaluate and improve normalisation of ontology metadata. We also discuss the applicability and appropriateness of general-purpose sentiment analysis applied to biomedical texts, and potential influences of bias on computational analysis, in the context of our results.
DOI:	10.1109/BIBM52615.2021.9669329
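
Illustrative sketch:	The summary describes scoring the sentiment of paired disease definitions and looking for outlying differences. The Python below is a minimal sketch of that general idea only, not the authors' implementation. The toy lexicon, the function names, the z-score threshold and the placeholder definitions are all assumptions made for illustration; the record does not state which sentiment analyser or outlier criterion the paper uses.

```python
import re
from statistics import mean, pstdev

# Hypothetical toy lexicon; a real study would use an established
# general-purpose sentiment analyser instead of these invented weights.
TOY_LEXICON = {
    "severe": -1.0, "fatal": -1.0, "failure": -1.0, "abnormal": -0.5,
    "benign": 0.5, "normal": 0.5, "healthy": 1.0,
}


def sentiment_score(text):
    """Mean lexicon weight over all word tokens (0.0 for empty text)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    return sum(TOY_LEXICON.get(tok, 0.0) for tok in tokens) / len(tokens)


def score_differences(pairs):
    """Map each class name to score(definition A) - score(definition B)."""
    return {
        name: sentiment_score(def_a) - sentiment_score(def_b)
        for name, (def_a, def_b) in pairs.items()
    }


def flag_outliers(diffs, threshold=1.5):
    """Return names whose difference lies more than `threshold`
    population standard deviations from the mean difference."""
    values = list(diffs.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [n for n, d in diffs.items() if abs(d - mu) / sigma > threshold]


if __name__ == "__main__":
    # Placeholder definitions written for this sketch; they are NOT taken
    # from MeSH or the Human Disease Ontology.
    pairs = {
        "class_a": ("A benign growth of tissue.", "A growth of tissue."),
        "class_b": ("A severe and often fatal failure of an organ.",
                    "Loss of function of an organ."),
        "class_c": ("A condition of the skin.", "A condition of the skin."),
    }
    diffs = score_differences(pairs)
    print(diffs)
    print(flag_outliers(diffs, threshold=1.0))
```

Flagged classes would only be candidates for manual review by an ontology developer; a large sentiment difference between two definitions does not by itself establish bias.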