Predicting numeric ratings for Google apps using text features and ensemble learning

Bibliographic Details
Published in	ETRI journal Vol. 43; no. 1; pp. 95 - 108
Main Authors	Umer, Muhammad; Ashraf, Imran; Mehmood, Arif; Ullah, Saleem; Choi, Gyu Sang
Format	Journal Article
Language	Korean
Published	Electronics and Telecommunications Research Institute (한국전자통신연구원, ETRI) 10.02.2021
Subjects	ensemble learning; Google app rating; opinion mining; text mining; text features; data mining

More Information
Summary:	Application (app) ratings are feedback provided voluntarily by users and serve as important evaluation criteria for apps. However, these ratings can often be biased owing to insufficient or missing votes. Additionally, significant differences have been observed between numeric ratings and user reviews. This study aims to predict the numeric ratings of Google apps using machine learning classifiers. It exploits numeric app ratings provided by users as training data and returns authentic mobile app ratings by analyzing user reviews. An ensemble learning model is proposed for this purpose that considers term frequency/inverse document frequency (TF/IDF) features. Three TF/IDF features, including unigrams, bigrams, and trigrams, were used. The dataset was scraped from the Google Play store, extracting data from 14 different app categories. Biased and unbiased user ratings were discriminated using TextBlob analysis to formulate the ground truth, from which the classifier prediction accuracy was then evaluated. The results demonstrate the high potential for machine learning-based classifiers to predict authentic numeric ratings based on actual user reviews.
Bibliography:	KISTI1.1003/JNL.JAKO202149135472763
ISSN:	1225-6463 2233-7326
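
Illustration:	The summary mentions TF/IDF features built from unigrams, bigrams, and trigrams. The sketch below shows one way such features could be computed with the Python standard library only. It is not the authors' implementation: the helper names `ngrams` and `tfidf`, the sample reviews, and the unsmoothed weighting (term frequency times log(N / document frequency)) are all assumptions made for illustration, and the record does not state which weighting variant the paper uses.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams (joined by spaces) of a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n):
    """Compute TF-IDF weights for the n-grams of size n in each document.

    Assumed weighting (hypothetical, not taken from the paper):
    tf = count / number of n-grams in the document, idf = log(N / df).
    """
    grams_per_doc = [ngrams(doc.lower().split(), n) for doc in docs]
    # Document frequency: number of documents that contain each n-gram.
    df = Counter(g for grams in grams_per_doc for g in set(grams))
    total_docs = len(docs)
    weights = []
    for grams in grams_per_doc:
        counts = Counter(grams)
        size = len(grams) or 1  # guard against reviews shorter than n tokens
        weights.append({
            g: (c / size) * math.log(total_docs / df[g])
            for g, c in counts.items()
        })
    return weights


# Hypothetical reviews, invented for illustration only.
reviews = [
    "great app works great",
    "app crashes all the time",
    "great app",
]

unigram_features = tfidf(reviews, 1)
bigram_features = tfidf(reviews, 2)
trigram_features = tfidf(reviews, 3)
```

Under this weighting, an n-gram that occurs in every review (such as "app" here) gets a weight of zero, while n-grams concentrated in a few reviews score higher. Feature dictionaries of this kind would then be vectorized and passed to the classifiers that make up an ensemble.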