Imbalanced class distribution and performance evaluation metrics: A systematic review of prediction accuracy for determining model performance in healthcare systems

Bibliographic Details
Published in	PLOS digital health Vol. 2; no. 11; p. e0000290
Main Authors	Owusu-Adjei, Michael; Ben Hayfron-Acquah, James; Frimpong, Twum; Abdul-Salaam, Gaddafi
Format	Journal Article
Language	English
Published	United States: Public Library of Science (PLoS), 01.11.2023
Subjects	Artificial intelligence; Biology and Life Sciences; Breast cancer; Computer and Information Sciences; Datasets; Engineering and Technology; Health care; Medicine and Health Sciences; Neural networks; Physical Sciences; Prediction models; Research and Analysis Methods; Support vector machines
ISSN	2767-3170
DOI	10.1371/journal.pdig.0000290

More Information
Summary:	Most research studies focus extensively on predictive algorithms and their performance evaluation in order to identify the best or most appropriate predictive model, with the optimum prediction solution indicated by the prediction accuracy score, precision, recall, F1 score, etc. The prediction accuracy score from performance evaluation has been used extensively as the main determining metric for performance recommendation. It is one of the most widely used metrics for identifying the optimal prediction solution, irrespective of the context or nature of the dataset and of how the output classes are distributed between the minority and majority classes. The key research question, however, is how class inequality affects the prediction accuracy score, as compared with the balanced accuracy score, when determining model performance on datasets with imbalanced output classes in healthcare and other real-world application systems. Answering this question requires an appraisal of the current state of knowledge on the use of both the prediction accuracy score and the balanced accuracy score in real-world applications where the class distribution is unequal. A review of related works that highlight the use of imbalanced class distribution datasets with evaluation metrics will assist in contextualizing this systematic review.
Bibliography:	The authors have declared that no competing interests exist.
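
The contrast the summary draws between the prediction accuracy score and the balanced accuracy score can be illustrated with a small sketch. This is not taken from the article: the 95/5 class split, the always-predict-majority "model", and the helper names `accuracy` and `balanced_accuracy` are all assumptions made for illustration. Balanced accuracy is computed here as the unweighted mean of per-class recall.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of all samples that were labelled correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall, so each class counts equally."""
    support = Counter(y_true)  # number of true samples per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per class
    recalls = [hits[c] / support[c] for c in support]
    return sum(recalls) / len(recalls)


# Hypothetical screening data: 95 negatives (majority), 5 positives (minority).
y_true = [0] * 95 + [1] * 5
# A degenerate "model" that always predicts the majority class.
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)
bal_acc = balanced_accuracy(y_true, y_pred)
print(f"accuracy          = {acc:.2f}")      # 0.95
print(f"balanced accuracy = {bal_acc:.2f}")  # 0.50
```

Under these assumed numbers the model misses every minority-class case yet still scores 0.95 on plain accuracy, while balanced accuracy falls to 0.50, the level expected from an uninformative binary classifier.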