Mass tourism web text semantic analysis method based on model fusion
Format | Patent |
---|---|
Language | Chinese, English |
Published | 23.09.2022 |
Summary: | The invention discloses a mass tourism web text semantic analysis method based on model fusion. The method comprises the following steps: acquiring a comment data set and preprocessing its data; performing visual analysis on the data; applying DBSCAN density clustering to the comment data set to obtain data set D1; applying a Word2Vec model to obtain data set D2; applying the Simhash algorithm to obtain data set D3; applying an N-Gram language model to obtain data set D4; integrating the results in data sets D1 to D4 to obtain data set D5; importing the preprocessed data set D5 into a TF-IDF model and an LDA model to extract keywords and subject terms; vectorizing the words to calculate the distance between the keyword vector and the subject term vector of each comment, and outputting the highest-scoring words according to those distances; and constructing triples by combining the feature words, hotel names, and hotel types. |
---|---|
Bibliography: | Application Number: CN202210772206 |
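
The last two steps of the summary (keyword extraction and triple construction) can be illustrated with a minimal sketch. It is not the patented implementation: the comments, hotel names, and hotel types are hypothetical toy data, the TF-IDF weighting is a plain textbook variant written with the standard library only, and the LDA, word-vector distance, and D1 to D4 fusion steps are omitted. The names `tfidf_keywords` and `build_triples` are assumptions introduced for illustration.

```python
import math
from collections import Counter

# Hypothetical toy data: (hotel name, hotel type, tokenized comment).
COMMENTS = [
    ("Hotel A", "business", ["clean", "room", "fast", "wifi"]),
    ("Hotel A", "business", ["fast", "wifi", "quiet", "room"]),
    ("Hotel B", "resort", ["beach", "pool", "clean", "room"]),
]


def tfidf_keywords(docs, top_k=2):
    """Return the top_k tokens of each non-empty document ranked by TF-IDF."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    results = []
    for doc in docs:
        tf = Counter(doc)
        scores = {
            tok: (count / len(doc)) * math.log(n_docs / df[tok])
            for tok, count in tf.items()
        }
        # Sort by descending score, then alphabetically to break ties.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_k])
    return results


def build_triples(comments, top_k=2):
    """Combine each keyword with its hotel name and hotel type."""
    keywords = tfidf_keywords([tokens for _, _, tokens in comments], top_k)
    triples = set()
    for (name, hotel_type, _), words in zip(comments, keywords):
        for word in words:
            triples.add((word, name, hotel_type))
    return sorted(triples)


TRIPLES = build_triples(COMMENTS)
```

In this toy data, "room" appears in every comment, so its inverse document frequency is zero and it never becomes a feature word, while words specific to one comment (such as "quiet" or "beach") rank highest.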