A document topic vector extraction method based on deep learning
Main Authors | , , |
---|---|
Format | Patent |
Language | Chinese, English |
Published | 11.12.2018 |
Summary: | The invention relates to a deep-learning-based method for extracting document topic vectors and belongs to the technical field of natural language processing. The method uses a convolutional neural network (CNN) to extract deep, local semantic information and an LSTM model to learn temporal (sequential) information, so that the resulting vectors carry more complete semantics. It uses the implicit co-occurrence relationship between context phrases and document topics, which avoids the weaknesses that some sentence-based topic vector models show on short texts. An attention mechanism combines the CNN and LSTM models so that the deep semantics of the context, its temporal information, and its salient information are all learned, yielding a more effective model for document topic vector extraction. |
---|---|
Bibliography: | Application Number: CN201810748564 |
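
The summary names three components (a CNN for local features, an LSTM for temporal information, and an attention mechanism that combines them) but this record does not say how they are wired together. The following is only a minimal sketch of one plausible arrangement, not the patented method: the function names, the random untrained weights, the omission of bias terms, and the choice to use max-pooled CNN features as the attention query over the LSTM states are all assumptions made for illustration.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def conv1d_relu(X, filters, width):
    """Slide a window of `width` words over X (zero-padded) and apply ReLU."""
    dim, pad = len(X[0]), width // 2
    padded = [[0.0] * dim] * pad + X + [[0.0] * dim] * pad
    out = []
    for t in range(len(X)):
        window = [v for vec in padded[t:t + width] for v in vec]
        out.append([max(0.0, v) for v in matvec(filters, window)])
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def lstm(X, gates, hidden):
    """Plain LSTM without biases (assumed); returns one hidden state per word."""
    h, c, states = [0.0] * hidden, [0.0] * hidden, []
    for x in X:
        z = x + h  # concatenate input and previous hidden state
        i = [sigmoid(v) for v in matvec(gates["i"], z)]
        f = [sigmoid(v) for v in matvec(gates["f"], z)]
        o = [sigmoid(v) for v in matvec(gates["o"], z)]
        g = [math.tanh(v) for v in matvec(gates["g"], z)]
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        states.append(h)
    return states


def attention_pool(local_feats, states):
    """Assumed combination: CNN features form the query, LSTM states are attended."""
    hidden = len(states[0])
    query = [max(f[k] for f in local_feats) for k in range(hidden)]
    scores = [sum(q * s for q, s in zip(query, st)) for st in states]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
    total = sum(exps)
    weights = [e / total for e in exps]
    vector = [sum(w * st[k] for w, st in zip(weights, states)) for k in range(hidden)]
    return vector, weights


def topic_vector(embeddings, hidden=4, width=3, seed=0):
    """Map a context phrase (list of word embeddings) to a fixed-size vector."""
    rng = random.Random(seed)
    dim = len(embeddings[0])
    filters = rand_matrix(hidden, width * dim, rng)
    gates = {name: rand_matrix(hidden, dim + hidden, rng) for name in "ifog"}
    local = conv1d_relu(embeddings, filters, width)
    states = lstm(embeddings, gates, hidden)
    return attention_pool(local, states)


# Hypothetical input: a 6-word context phrase with 5-dimensional embeddings.
rng = random.Random(1)
phrase = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(6)]
vec, weights = topic_vector(phrase)
```

Here `vec` has one entry per hidden unit and `weights` holds one attention weight per word, summing to 1. In a trained model the weights would be fitted against a topic-related objective; the summary's mention of co-occurrence between context phrases and document topics suggests such an objective, but its exact form is not given in this record.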