Exploring Sentiment as a Potential Indicator of Bias in Disease Ontologies
Published in | 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) pp. 1826 - 1834 |
---|---|
Main Authors | , , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE 09.12.2021 |
Summary: | Ontologies are fundamental tools for the organisation and analysis of biomedical data. One of their roles is as controlled domain vocabularies, providing standardised language and categorisation for relevant domain concepts. As such, ontologies frequently include a wealth of natural language metadata, including labels and definitions. Since these metadata are usually created by humans, conscious and unconscious biases may be reflected in them. Moreover, humans and computers engage directly with these metadata during the course of scientific practice, and therefore any biases or idiosyncrasies may influence work involving the use of these concepts. Previous work has exposed the possibility of bias in ontological representations of disease domains; however, no methods have been developed for automatic or semiautomatic guidance towards bias in ontology metadata. In this article, we develop an approach to explore sentiment analysis as a potential indicator of bias in ontology concept definitions. We evaluate its use on pairs of disease classes from MeSH and the Human Disease Ontology (DO), comparing and contrasting sentiment scores between them. We use these pairs to identify and evaluate a number of outlying examples, relating them to existing literature. We discuss how our approach could be used to guide ontology developers towards outlying and potentially biased language, forming a tool that could be used to evaluate and improve normalisation of ontology metadata. We also discuss the applicability and appropriateness of general-purpose sentiment analysis applied to biomedical texts, and potential influences of bias on computational analysis, in the context of our results. |
DOI: | 10.1109/BIBM52615.2021.9669329 |
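
The comparison described in the summary (score the definitions of paired disease classes, then look for pairs whose scores diverge unusually) might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' method: the record does not say which sentiment tool or outlier criterion the paper uses, so `TOY_LEXICON`, `toy_sentiment`, `flag_outlying_pairs`, and the z-score threshold are all hypothetical stand-ins, and the example definitions are invented placeholders rather than text from MeSH or DO.

```python
from statistics import mean, pstdev

# Hypothetical toy lexicon (an assumption for illustration only); a real
# analysis would use an established general-purpose sentiment model.
TOY_LEXICON = {
    "severe": -1.0, "abnormal": -1.0, "failure": -1.0, "deviant": -2.0,
    "benign": 1.0, "normal": 0.5, "healthy": 1.0,
}


def toy_sentiment(text: str) -> float:
    """Mean lexicon score over all tokens; 0.0 for empty text."""
    tokens = [t.strip(".,;:()").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    return sum(TOY_LEXICON.get(t, 0.0) for t in tokens) / len(tokens)


def flag_outlying_pairs(pairs, threshold=1.5):
    """Return labels of pairs whose score difference is an outlier.

    `pairs` maps a label to (definition_a, definition_b). A pair is flagged
    when its score difference lies more than `threshold` population standard
    deviations from the mean difference (an assumed criterion).
    """
    diffs = {
        label: toy_sentiment(a) - toy_sentiment(b)
        for label, (a, b) in pairs.items()
    }
    mu = mean(diffs.values())
    sigma = pstdev(diffs.values())
    if sigma == 0:
        return []  # all differences identical: nothing stands out
    return [
        label for label, d in diffs.items()
        if abs(d - mu) / sigma > threshold
    ]


if __name__ == "__main__":
    # Invented placeholder definitions, not taken from any ontology.
    example_pairs = {
        "example-1": ("a condition of the skin", "a condition of the skin"),
        "example-2": ("a disease of the lung", "a disease of the lung"),
        "example-3": ("a disorder of the eye", "a disorder of the eye"),
        "example-4": ("a disease of the liver", "a disease of the liver"),
        "example-5": ("a deviant abnormal state", "a state"),
    }
    print(flag_outlying_pairs(example_pairs))
```

Flagged pairs would only be candidates for manual review by an ontology developer; a divergent sentiment score does not by itself establish bias.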