Text clustering method and device, electronic equipment and storage medium

Bibliographic Details
Main Authors: HUANG JINGXIU, QIU ZHAOYI, LI YICHEN, DING RUOFEI, ZHENG YUNXIANG, WU XIAOMIN, CHEN SHUMIN
Format: Patent
Language: Chinese, English
Published: 03.11.2023
Summary: The invention discloses a text clustering method and device, electronic equipment, and a storage medium. The method comprises the following steps: obtaining target corpus data and preprocessing it to obtain target text data; vectorizing the target text data to obtain a first sentence vector matrix; performing dimension reduction on the first sentence vector matrix to obtain a second sentence vector matrix; constructing a vocabulary library from the target text data and performing topic modeling on the vocabulary library to obtain a probability matrix; concatenating the second sentence vector matrix and the probability matrix to obtain a target matrix; and fitting on the target matrix to obtain a target clustering centroid, then performing text clustering based on that centroid to obtain a text clustering result. The method alleviates the problem that topic clustering ignores the context information of a text; meanwhile,
Bibliography: Application Number: CN202310859085
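
The steps listed in the summary can be sketched roughly as follows. This is only an illustrative outline, not the patented implementation: the record does not name any specific algorithm, so every concrete choice here is an assumption. TF-IDF stands in for the sentence vectorizer, truncated SVD for the dimension reduction, latent Dirichlet allocation for the topic modeling, and k-means for the centroid fitting. The function name `cluster_texts`, its parameters, and the sample sentences are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import LatentDirichletAllocation, TruncatedSVD
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer


def cluster_texts(texts, n_clusters=2, n_components=2, n_topics=2, seed=0):
    """Hypothetical sketch of the summarized pipeline (all choices assumed)."""
    # Preprocessing (assumed): trim whitespace and lowercase the corpus.
    docs = [t.strip().lower() for t in texts]

    # First sentence vector matrix (assumed: TF-IDF as a stand-in vectorizer).
    first = TfidfVectorizer().fit_transform(docs)

    # Second sentence vector matrix (assumed: truncated SVD for reduction).
    svd = TruncatedSVD(n_components=n_components, random_state=seed)
    second = svd.fit_transform(first)

    # Vocabulary library and topic modeling (assumed: word counts + LDA).
    counts = CountVectorizer().fit_transform(docs)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=seed)
    topic_probs = lda.fit_transform(counts)  # one topic distribution per text

    # Target matrix: reduced sentence vectors joined with topic probabilities.
    target = np.hstack([second, topic_probs])

    # Fit centroids and assign each text to a cluster (assumed: k-means).
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(target)
    return km.labels_, km.cluster_centers_, target


texts = [
    "The cat sat on the warm mat",
    "A dog chased the cat across the yard",
    "Cats and dogs are common household pets",
    "The stock market fell sharply today",
    "Investors sold shares as prices dropped",
    "The central bank raised interest rates",
]
labels, centroids, target = cluster_texts(texts)
```

Each row of `target` holds the reduced sentence vector followed by the topic probabilities for one text, so the clustering step sees both kinds of features at once.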