Keyword extraction method and extraction system for internet Chinese text


Bibliographic Details
Main Authors: ZHENG YUFAN, ZHAO CHE
Format: Patent
Language: Chinese, English
Published: 02.02.2021
Summary: The invention relates to a keyword extraction method and system for Chinese text on the internet. The method comprises the following steps: building a vocabulary from the words in the text; constructing a keyword candidate set from the vocabulary; calculating a score for each word in the candidate set; calculating a score for each phrase in the candidate set; removing duplicate key phrases from the candidate set; and sorting the keywords by score and outputting the one or more highest-scoring keywords as the keywords of the text. The system comprises a first construction unit, a second construction unit, a first calculation unit, a second calculation unit, a processing unit and an output unit. Both keywords and key phrases can be extracted from Chinese text.
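The six steps in the summary can be sketched in Python as follows. This is only an illustrative sketch, not the patented method: the record does not say how scores are computed or how duplicates are detected, so the relative-frequency word score, the frequency-weighted phrase score, the merge-by-surface-string deduplication, and the names extract_keywords, stopwords, max_phrase_len and top_k are all assumptions. The input is assumed to be already segmented into Chinese words.

```python
from collections import Counter


def extract_keywords(tokens, stopwords=frozenset(), max_phrase_len=2, top_k=3):
    """Hypothetical sketch of the six summarized steps (scoring is assumed)."""
    tokens = list(tokens)

    # Step 1: vocabulary made of the words in the text (stopwords excluded).
    vocabulary = {t for t in tokens if t not in stopwords}

    # Step 2: candidate set = single words plus phrases built from
    # adjacent vocabulary words (assumption: phrases are word n-grams).
    words = [t for t in tokens if t in vocabulary]
    phrases = []
    run = []
    for t in tokens + [None]:  # the None sentinel flushes the final run
        if t is not None and t in vocabulary:
            run.append(t)
            continue
        for n in range(2, max_phrase_len + 1):
            for i in range(len(run) - n + 1):
                phrases.append(tuple(run[i:i + n]))
        run = []

    # Step 3: score each word (assumption: relative frequency).
    counts = Counter(words)
    total = sum(counts.values())
    word_score = {w: c / total for w, c in counts.items()}

    # Step 4: score each phrase (assumption: phrase frequency times the
    # sum of its word scores).
    phrase_counts = Counter(phrases)
    phrase_score = {
        p: phrase_counts[p] * sum(word_score[w] for w in p)
        for p in phrase_counts
    }

    # Step 5: remove duplicates (assumption: candidates with the same
    # surface string are merged, keeping the higher score).
    merged = dict(word_score)
    for p, s in phrase_score.items():
        key = "".join(p)
        merged[key] = max(s, merged.get(key, 0.0))

    # Step 6: sort by score and output the highest-scoring keywords.
    ranked = sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))
    return [keyword for keyword, _ in ranked[:top_k]]


tokens = ["关键词", "提取", "方法", "的", "关键词", "提取", "系统"]
print(extract_keywords(tokens, stopwords={"的"}))
```

Under these assumed scoring rules a repeated phrase such as 关键词提取 outranks its component words, which mirrors the summary's claim that key phrases as well as keywords can be extracted.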
Bibliography: Application Number: CN201910666464