Summarizing Online Movie Reviews: A Machine Learning Approach to Big Data Analytics
Published in | Scientific Programming Vol. 2020; no. 2020; pp. 1-14 |
---|---|
Format | Journal Article |
Language | English |
Published | Cairo, Egypt; Hindawi Publishing Corporation; 2020; Hindawi; Hindawi Limited |
Summary: | Information on the web is growing at an exponential pace, and online movie reviews are becoming a substantial information resource for online users. However, users post millions of movie reviews on a regular basis, and it is not feasible for readers to summarize them manually. Movie review classification and summarization is one of the challenging tasks in natural language processing. An automatic approach is therefore needed to summarize the vast number of movie reviews, allowing users to quickly distinguish the positive and negative aspects of a movie. This study proposes an approach for movie review classification and summarization. For classification, a bag-of-words feature extraction technique is used to extract unigrams, bigrams, and trigrams as a feature set from the given review documents and to represent the documents in a vector space model. Next, the Naïve Bayes algorithm is employed to classify the movie reviews (represented as feature vectors) as positive or negative. For summarization, a Word2vec feature extraction technique is used to extract features from the classified movie review sentences, and a semantic clustering technique then groups semantically related review sentences. Different text features are used to calculate the salience score of each review sentence in the clusters. Finally, the top-ranked sentences with the highest salience scores are chosen to produce the extractive summary of the movie reviews. Experimental results reveal that the proposed machine learning approach is superior to other state-of-the-art approaches. |
---|---|
ISSN: | 1058-9244, 1875-919X |
DOI: | 10.1155/2020/5812715 |
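
The classification stage described in the summary (unigram, bigram, and trigram bag-of-words features fed to a Naïve Bayes classifier) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the whitespace tokenizer, the multinomial model with add-one smoothing, the helper names `ngrams` and `NaiveBayes`, and the toy reviews are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, max_n=3):
    """Return unigrams, bigrams, and trigrams of a whitespace-tokenized text."""
    tokens = text.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (an assumed variant)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        # Bag-of-words counts of n-gram features, kept separately per class.
        self.feature_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.feature_counts[label].update(ngrams(doc))
        self.vocab = {f for c in self.feature_counts.values() for f in c}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for feat in ngrams(doc):
                if feat in self.vocab:  # skip n-grams never seen in training
                    score += math.log((counts[feat] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Toy reviews, invented purely for illustration.
train_docs = [
    "a great movie with a great cast",
    "wonderful story and great acting",
    "a boring movie with a terrible plot",
    "terrible acting and a boring story",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayes().fit(train_docs, train_labels)
print(model.predict("great story"))
print(model.predict("boring plot"))
```

The summarization stage (Word2vec features, semantic clustering, and salience scoring) is not sketched here because the summary does not specify the text features or clustering method used.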