Research on Text Information Mining Technology of Substation Inspection Based on Improved Jieba

Bibliographic Details
Published in 2021 International Conference on Wireless Communications and Smart Grid (ICWCSG) pp. 561 - 564
Main Authors Ding, Yi; Teng, Fei; Zhang, Pan; Huo, Xianxu; Sun, Qiao; Qi, Yan
Format Conference Proceeding
Language English
Published IEEE 01.08.2021
DOI 10.1109/ICWCSG53609.2021.00119

Summary: With the development of the smart grid, the power system has accumulated a large amount of data. Substation inspection records are mostly entered manually and stored in the power database as text, which makes them difficult to use. To address the poor performance of general-purpose word segmentation on power-domain text, this paper proposes using the TF-IDF algorithm to improve the general Jieba word segmenter. TF-IDF is used to identify and weight power-domain feature words; words with higher weights are added to the keyword list so that the more important words are retained. The paper thereby achieves effective word segmentation of substation inspection record text. Comparative experiments with traditional techniques indicate that the improved segmentation is more accurate and handles professional terminology better.
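
The weighting step described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `tfidf_scores` and `update_keywords`, the threshold, and the sample records are all hypothetical, and the TF-IDF variant used here (term frequency times log(N/df), taking each term's best score across records) is only one common formulation. The sketch assumes the records have already been split into candidate terms.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Return {term: highest TF-IDF score of that term over all docs}.

    docs is a list of token lists (hypothetical, pre-split candidate terms).
    TF = count / document length, IDF = log(N / document frequency).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document

    best = {}
    for tokens in docs:
        if not tokens:
            continue
        counts = Counter(tokens)
        for term, count in counts.items():
            tf = count / len(tokens)
            idf = math.log(n_docs / doc_freq[term])
            best[term] = max(best.get(term, 0.0), tf * idf)
    return best


def update_keywords(keywords, scores, threshold):
    """Return keywords plus every term whose score reaches the threshold."""
    return set(keywords) | {t for t, s in scores.items() if s >= threshold}


# Made-up inspection records, for illustration only.
records = [
    ["主变压器", "油位", "正常"],
    ["断路器", "外观", "正常"],
    ["主变压器", "温度", "正常"],
]
scores = tfidf_scores(records)
keyword_list = update_keywords(set(), scores, threshold=0.1)
```

A term that appears in every record (here "正常") receives an IDF of zero and is filtered out, while rarer terms pass the threshold. In a Jieba-based pipeline, the retained terms could then be registered with the segmenter, for example through `jieba.add_word`, so that they are kept as single tokens; whether the paper does exactly this is an assumption.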