Spam review detection with Metapath-aggregated graph convolution network

Bibliographic Details
Published in Journal of intelligent & fuzzy systems Vol. 45; no. 2; pp. 3005-3023
Main Authors Jayashree, P., Laila, K., Amuthan, Aara
Format Journal Article
Language English
Published Amsterdam IOS Press BV 01.01.2023
Summary: The large flux of online products in today’s world makes business reviews a valuable resource for consumers who want to make sound decisions before purchasing online. Reviews help readers learn more about a product and gauge its quality. Fake reviews and reviewers form the bulk of the review corpus, making review spamming an open research challenge. These spam reviews must be detected to nullify their contribution to product recommendations. Researchers and communities have long treated spam detection as a matter of serious concern. Even so, there is room for further exploration of performance on large-scale complex datasets. This work contributes robust feature selection with derived features that provide more detail on malicious reviews and spammers. Ensemble and other standard machine learning techniques are trained and evaluated over the optimal feature sets. In addition, a Metapath-based Graph Convolution Network (M-GCN) framework is proposed, an implicit knowledge extraction method that automatically captures the complex semantic meaning of reviews from the heterogeneous network. It analyzes triplet (user, review, product) relationships in e-commerce sites by examining Top-n feature sets in a mutually reinforcing manner. The proposed model is evaluated on the Yelp and Amazon benchmark datasets and is shown to outperform state-of-the-art techniques, both with and without graph utilization, achieving an accuracy of 96% in the prediction task.
ISSN: 1064-1246, 1875-8967
DOI:10.3233/JIFS-223136
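
The summary describes graph convolution over metapaths in a user-review-product network but gives no implementation detail. The following is only a minimal sketch of the general idea, not the authors' M-GCN: it assumes two metapaths (review-user-review and review-product-review), a single GCN layer per metapath, and plain mean aggregation across metapaths. All function names, the toy data, and the random weights are hypothetical.

```python
import numpy as np

def metapath_adjacency(incidence):
    """Review-to-review adjacency induced by a metapath such as
    review-user-review. incidence[i, j] = 1 when review i is linked
    to entity j (a user or a product)."""
    adj = (incidence @ incidence.T > 0).astype(float)
    np.fill_diagonal(adj, 1.0)  # self-loops keep each review's own features
    return adj

def normalize(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2}, the usual GCN choice."""
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(adj_norm, features, weights):
    """One graph-convolution layer followed by ReLU."""
    return np.maximum(adj_norm @ features @ weights, 0.0)

def metapath_aggregate(incidences, features, weights_per_path):
    """Run one GCN layer per metapath and average the results.
    Mean aggregation is an assumption made for this sketch."""
    outputs = [
        gcn_layer(normalize(metapath_adjacency(inc)), features, w)
        for inc, w in zip(incidences, weights_per_path)
    ]
    return np.mean(outputs, axis=0)

# Hypothetical toy data: 4 reviews, 2 users, 2 products.
review_user = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
review_product = np.array([[1, 0], [0, 1], [1, 0], [0, 1]], dtype=float)

rng = np.random.default_rng(0)
features = rng.random((4, 3))                      # 3 features per review
weights = [rng.normal(size=(3, 2)) for _ in range(2)]  # untrained weights

embeddings = metapath_aggregate([review_user, review_product], features, weights)
print(embeddings.shape)
```

In a real pipeline the weights would be learned against spam labels and the resulting review embeddings passed to a classifier; here they are random, so the output only illustrates the data flow.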