Stop word data augmentation for natural language processing

Bibliographic Details
Main Authors BISHNOI, VISHNU, SINGARAJU GAUTAM, JOHNSON MARTIN E, JALALUDDIN EMA L, VINNAKOTA BALAJI SHANKAR, DUONG THANH LONG
Format Patent
Language Chinese, English
Published 29.04.2022
Summary: A stop word data augmentation technique for training a chatbot system in natural language processing. In one particular aspect, a computer-implemented method includes receiving a training set of utterances for training an intent classifier to identify the intent of an utterance; augmenting the training set of utterances with stop words to generate an augmented training set of out-of-domain utterances for an unresolved intent category corresponding to an unresolved intent; and training the intent classifier using the training set of utterances and the augmented training set of out-of-domain utterances. The augmenting includes selecting one or more utterances from the training set of utterances and, for each selected utterance, preserving the existing stop words within the utterance and replacing at least one non-stop word within the utterance with a stop word or stop word phrase selected from a list of stop words to generate an out-of-domain utterance.
Bibliography: Application Number: CN202080064541
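
The augmentation step described in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation: the stop word list, the whitespace tokenizer, the label name `OOD_LABEL`, and both function names are assumptions made for illustration only. The sketch keeps every existing stop word and replaces every non-stop word, which satisfies "at least one" whenever the utterance contains a non-stop word.

```python
import random

# Hypothetical stop word list; this record does not give the actual list.
STOP_WORDS = ["the", "a", "an", "is", "of", "to", "in", "and", "it", "for"]

# Hypothetical label for the out-of-domain intent category.
OOD_LABEL = "out_of_domain"


def augment_with_stop_words(utterance, stop_words=STOP_WORDS, rng=None):
    """Build an out-of-domain utterance from an in-domain one.

    Existing stop words are kept in place; every other token is
    replaced by a stop word drawn at random from `stop_words`.
    Tokenization is a plain whitespace split (an assumption).
    """
    if rng is None:
        rng = random.Random()
    stop_set = {w.lower() for w in stop_words}
    out = []
    for token in utterance.split():
        if token.lower() in stop_set:
            out.append(token)                    # keep existing stop word
        else:
            out.append(rng.choice(stop_words))   # replace non-stop word
    return " ".join(out)


def build_ood_set(utterances, n_samples, seed=0):
    """Select utterances and return (augmented_text, OOD_LABEL) pairs."""
    rng = random.Random(seed)
    chosen = rng.sample(utterances, min(n_samples, len(utterances)))
    return [(augment_with_stop_words(u, rng=rng), OOD_LABEL) for u in chosen]


if __name__ == "__main__":
    training = ["what is the balance of my account", "transfer money to savings"]
    for text, label in build_ood_set(training, n_samples=2):
        print(label, "|", text)
```

The resulting pairs would be appended to the original training set before the intent classifier is trained, so that utterances made up only of stop words are assigned to the out-of-domain category instead of to a real intent.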