Text clustering method and device, electronic equipment and storage medium
Main Authors | , , , , , , |
---|---|
Format | Patent |
Language | Chinese, English |
Published | 03.11.2023 |
Subjects | |
Summary: | The invention discloses a text clustering method and device, electronic equipment, and a storage medium. The method comprises the following steps: obtaining target corpus data and preprocessing it to obtain target text data; vectorizing the target text data to obtain a first sentence vector matrix; performing dimensionality reduction on the first sentence vector matrix to obtain a second sentence vector matrix; constructing a vocabulary from the target text data and performing topic modeling on that vocabulary to obtain a probability matrix; concatenating the second sentence vector matrix and the probability matrix to obtain a target matrix; and fitting the target matrix to obtain target cluster centroids, then clustering the text based on those centroids to obtain a text clustering result. The method alleviates the problem that topic clustering ignores the context information of a text; meanwhile, |
---|---|
Bibliography: | Application Number: CN202310859085 |
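
The steps listed in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation: the record does not name the vectorizer, the dimensionality reduction technique, the topic model, or the clustering algorithm. The sketch assumes scikit-learn and substitutes TF-IDF for the sentence vectors, truncated SVD for the reduction, latent Dirichlet allocation for the topic model, and k-means for the centroid fitting. The function name `cluster_texts` and all of its parameters are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import LatentDirichletAllocation, TruncatedSVD
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer


def cluster_texts(texts, n_clusters=2, n_topics=2, n_dims=2, seed=0):
    """Hypothetical end-to-end sketch of the pipeline in the summary."""
    # Preprocessing. Assumption: trimming and lowercasing stand in for
    # whatever cleaning the patent applies to the target corpus data.
    cleaned = [t.strip().lower() for t in texts]

    # First sentence vector matrix. Assumption: TF-IDF replaces the
    # unspecified sentence vectorizer.
    first = TfidfVectorizer().fit_transform(cleaned)

    # Second sentence vector matrix, obtained by dimensionality reduction.
    # Assumption: truncated SVD; the source does not name the technique.
    svd = TruncatedSVD(n_components=n_dims, random_state=seed)
    second = svd.fit_transform(first)

    # Vocabulary plus topic modeling. Assumption: LDA over raw term counts;
    # each row of `probs` is a document-topic distribution.
    counts = CountVectorizer().fit_transform(cleaned)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=seed)
    probs = lda.fit_transform(counts)

    # Target matrix: reduced sentence vectors and topic probabilities
    # concatenated column-wise.
    target = np.hstack([second, probs])

    # Fit centroids on the target matrix and assign each text to a cluster.
    # Assumption: k-means.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(target)
    return km.labels_, km.cluster_centers_, target


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "dogs and cats are common pets",
        "my dog chased the cat",
        "stock markets fell sharply today",
        "investors sold shares as markets dropped",
        "the bank raised interest rates",
    ]
    labels, centroids, target = cluster_texts(docs)
    print(labels, centroids.shape, target.shape)
```

Each row of the target matrix places a context-oriented representation of a text next to its topic distribution, which is how the summary describes bringing context information into topic clustering. In practice the two blocks may need rescaling so that neither dominates the distance computation; the record does not say whether the patent does this.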