Summarizing Online Movie Reviews: A Machine Learning Approach to Big Data Analytics

Bibliographic Details
Published in Scientific Programming Vol. 2020; no. 2020; pp. 1-14
Main Authors Zaindin, Mazen; Ahmad, Shafiq; Shah, Syed Atif Ali; Uddin, M. Irfan; Gul, Muhammad Adnan; Khan, Atif; Al Firdausi, Muhammad Dzulqarnain
Format Journal Article
Language English
Published Cairo, Egypt Hindawi Publishing Corporation 2020
Hindawi
Hindawi Limited
Summary: Information on the web is growing at an exponential pace, and online movie reviews are becoming a substantial information resource for online users. However, users post millions of movie reviews on a regular basis, and it is not feasible for readers to summarize them manually. Movie review classification and summarization is one of the challenging tasks in natural language processing. An automatic approach is therefore needed to summarize the vast number of movie reviews and allow users to quickly distinguish the positive and negative aspects of a movie. This study proposes an approach for movie review classification and summarization. For classification, a bag-of-words feature extraction technique extracts unigrams, bigrams, and trigrams as a feature set from the given review documents and represents the documents in a vector space model. Next, the Naïve Bayes algorithm classifies the movie reviews (represented as feature vectors) as positive or negative. For summarization, a Word2vec feature extraction technique extracts features from the classified review sentences, and a semantic clustering technique then groups semantically related sentences. Different text features are used to calculate the salience score of each review sentence in the clusters. Finally, the top-ranked sentences, those with the highest salience scores, are chosen to produce the extractive summary of the movie reviews. Experimental results reveal that the proposed machine learning approach is superior to other state-of-the-art approaches.
ISSN:1058-9244
1875-919X
DOI:10.1155/2020/5812715
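
Illustrative sketch: the classification step described in the summary (bag-of-words features over unigrams, bigrams, and trigrams, followed by Naïve Bayes) might look roughly like the following. This is not the authors' implementation. The whitespace tokenizer, the add-one (Laplace) smoothing, the names `ngrams` and `NaiveBayes`, and the four toy training reviews are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, max_n=3):
    """Return the unigrams, bigrams, and trigrams of a lowercased text.

    Tokenization by whitespace is an assumption of this sketch.
    """
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]


class NaiveBayes:
    """Multinomial Naive Bayes over n-gram counts (hypothetical helper)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)          # documents per class
        self.feature_counts = defaultdict(Counter)   # n-gram counts per class
        for doc, label in zip(docs, labels):
            self.feature_counts[label].update(ngrams(doc))
        self.vocab = {f for c in self.feature_counts.values() for f in c}
        return self

    def predict(self, doc):
        # N-grams never seen in training are ignored.
        feats = [f for f in ngrams(doc) if f in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus add-one smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            score += sum(math.log((counts[f] + 1) / denom) for f in feats)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy data invented for illustration only.
train_docs = [
    "a great and moving film",
    "great acting and a wonderful story",
    "a dull and boring film",
    "terrible acting and a boring story",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayes().fit(train_docs, train_labels)
print(model.predict("wonderful film with great acting"))
print(model.predict("boring and terrible plot"))
```

In this sketch each review is reduced to counts of its n-grams, which plays the role of the vector space representation, and the class with the highest smoothed log probability is returned. The summarization stage (Word2vec features, semantic clustering, salience scoring) is not covered here.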