Relation-Mining-Driven Automatic Generation of Video Descriptions

Bibliographic Details
Published in: Nanjing Xinxi Gongcheng Daxue Xuebao, Vol. 9, No. 6, pp. 642-649
Main Authors: Huang Yi, Bao Bingkun, Xu Changsheng
Format: Journal Article
Language: Chinese
Published: Nanjing: Nanjing University of Information Science & Technology, 01.12.2017
Affiliations: National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
University of Chinese Academy of Sciences, Beijing 100049, China
ISSN: 1674-7070
DOI: 10.13878/j.cnki.jnuist.2017.06.008

Summary: Video description has received increasing interest in the field of computer vision. Generating video descriptions requires natural language processing techniques and the ability to handle variable lengths of both the input (a sequence of video frames) and the output (a sequence of description words). To this end, this paper draws on recent advances in machine translation and designs a two-layer LSTM (Long Short-Term Memory) model based on the encoder-decoder architecture. Since deep neural networks can learn appropriate representations of input data, we extract feature vectors from the video frames with a convolutional neural network (CNN) and use them as the input sequence of the LSTM model. Finally, we compare the influence of different feature extraction methods on the LSTM video description model. The results show that the proposed model can learn to transform a sequence of knowledge representations into natural language.
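The pipeline the summary describes — CNN frame features fed to an LSTM encoder, followed by an LSTM decoder that emits description words one at a time — can be sketched as below. This is a minimal illustrative sketch, not the authors' implementation: the LSTM cell, dimensions, vocabulary, and random (untrained) weights are all assumptions, and the CNN features are stand-in random vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCell:
    """Minimal LSTM cell (illustrative only, weights untrained)."""
    def __init__(self, input_dim, hidden_dim):
        self.hidden_dim = hidden_dim
        # One stacked weight matrix for the four gates (input, forget, cell, output)
        self.W = rng.standard_normal((4 * hidden_dim, input_dim + hidden_dim)) * 0.1
        self.b = np.zeros(4 * hidden_dim)

    def step(self, x, h, c):
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, g, o = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c_new = f * c + i * np.tanh(g)       # update memory cell
        h_new = o * np.tanh(c_new)           # emit new hidden state
        return h_new, c_new

def describe_video(frame_features, vocab, hidden_dim=16, max_words=5):
    """Encoder-decoder over a variable-length frame-feature sequence
    (hypothetical dimensions and vocabulary)."""
    feat_dim = frame_features.shape[1]
    V = len(vocab)
    enc = LSTMCell(feat_dim, hidden_dim)     # layer 1: encodes CNN features
    dec = LSTMCell(V, hidden_dim)            # layer 2: generates words
    W_out = rng.standard_normal((V, hidden_dim)) * 0.1

    # Encoding: fold the frame sequence into a fixed-size state (h, c)
    h = c = np.zeros(hidden_dim)
    for feat in frame_features:
        h, c = enc.step(feat, h, c)

    # Decoding: greedy word-by-word generation, seeded with a <bos> one-hot
    words, x = [], np.eye(V)[vocab.index("<bos>")]
    for _ in range(max_words):
        h, c = dec.step(x, h, c)
        idx = int(np.argmax(W_out @ h))      # most likely next word
        if vocab[idx] == "<eos>":
            break
        words.append(vocab[idx])
        x = np.eye(V)[idx]                   # feed the word back in
    return words

# Toy run: 8 frames of 32-d stand-in CNN features, tiny illustrative vocabulary
vocab = ["<bos>", "<eos>", "a", "man", "plays", "guitar"]
features = rng.standard_normal((8, 32))
print(describe_video(features, vocab))
```

With untrained weights the output words are arbitrary; training would fit the gate weights and output projection on paired video/sentence data, which is what distinguishes the paper's model from this skeleton.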