Multi-Resolution Convolutional Recurrent Networks


Bibliographic Details
Published in: 2021 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 2043 - 2048
Main Authors: Chien, Jen-Tzung; Huang, Yu-Min
Format: Conference Proceeding
Language: English
Published: APSIPA, 14.12.2021

Summary: In sequential learning tasks, recurrent neural networks (RNNs) have been developed successfully for many years. RNNs have achieved great success in a variety of applications involving audio, video, speech, and text data. On the other hand, the temporal convolutional network (TCN) has recently drawn considerable attention in a range of works. A TCN generally achieves performance comparable to an RNN but, attractively, can run more efficiently owing to the parallel computation of one-dimensional convolutions. A fundamental issue in sequential learning is to capture temporal dependencies at different time scales. In this paper, we present a new sequential learning machine called the multi-resolution convolutional recurrent network (MR-CRN), a hybrid model with a TCN encoder and an RNN decoder. Utilizing the representations learned by the TCN encoder at different layers, with various temporal resolutions, the RNN decoder can summarize contextual information across different resolutions and time scales without modifying the original architecture. In experiments on language modeling and action recognition, the merit of MR-CRN is illustrated for sequential learning and prediction in terms of latent representation, model perplexity, and recognition accuracy.
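The hybrid architecture the abstract describes — a TCN encoder whose layer-wise outputs at growing temporal resolutions are summarized by an RNN decoder — can be sketched minimally. The NumPy toy below is an illustrative assumption, not the authors' implementation: the kernel size, the dilations (1, 2, 4), the layer widths, and the plain Elman-style decoder are all hypothetical choices made only to show the data flow.

```python
import numpy as np

rng = np.random.default_rng(0)

def causal_conv1d(x, w, dilation):
    """Dilated causal 1-D convolution. x is (T, C_in), w is (k, C_in, C_out).
    Left zero-padding keeps the output length at T and prevents looking ahead."""
    k, _, c_out = w.shape
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros((pad, x.shape[1])), x], axis=0)
    out = np.zeros((x.shape[0], c_out))
    for t in range(x.shape[0]):
        for i in range(k):
            # tap i reaches back i * dilation time steps
            out[t] += xp[t + pad - i * dilation] @ w[k - 1 - i]
    return np.tanh(out)

def tcn_encoder(x, layers):
    """Stack dilated causal convolutions; each layer's feature map is kept,
    giving one representation per temporal resolution (the receptive field
    grows with the dilation)."""
    feats, h = [], x
    for w, dilation in layers:
        h = causal_conv1d(h, w, dilation)
        feats.append(h)
    return feats

def rnn_decoder(feats, W_h, W_x, b):
    """Vanilla (Elman) RNN that summarizes the concatenated
    multi-resolution features into a single context vector."""
    z = np.concatenate(feats, axis=1)          # (T, total channels)
    h = np.zeros(W_h.shape[0])
    for t in range(z.shape[0]):
        h = np.tanh(W_h @ h + W_x @ z[t] + b)
    return h

# Toy dimensions: 16 time steps, 4 channels, kernel size 2, hidden size 8.
T, C, K, H = 16, 4, 2, 8
x = rng.standard_normal((T, C))
layers = [(0.1 * rng.standard_normal((K, C, C)), d) for d in (1, 2, 4)]
feats = tcn_encoder(x, layers)                 # one feature map per resolution
context = rnn_decoder(feats,
                      0.1 * rng.standard_normal((H, H)),
                      0.1 * rng.standard_normal((H, 3 * C)),
                      np.zeros(H))
```

The key point the sketch makes is that the decoder consumes features from every encoder layer, not just the last one, so short- and long-range dependencies reach the decoder without any change to the RNN itself.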
ISSN:2640-0103