A factorization network based method for multi-lingual domain classification

Bibliographic Details
Published in	2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5276-5280
Main Authors	Yangyang Shi, Yi-Cheng Pan, Mei-Yuh Hwang, Kaisheng Yao, Hu Chen, Yuanhang Zou, Baolin Peng
Format	Conference Proceeding
Language	English
Published	IEEE 01.04.2015
Subjects	Artificial neural networks; Domain Classification; Error analysis; Factorization Networks; Polynomials; Spoken Language Understanding; Support vector machines; Training; Training data

More Information
Summary:	In many spoken language understanding systems (SLUS), domain classification is the most crucial component, as system responses based on wrong domains often yield very unpleasant user experiences. In multi-lingual domain classification, the training data for some resource-poor languages often comes from machine translation. Some of the higher-order n-gram features are distorted during machine translation, so feature co-occurrence becomes a more reliable feature in multi-lingual domain classification. In this paper, in order to model feature co-occurrences effectively, we propose Factorization Networks (FNs), which combine Factorization Machines (FMs) with Neural Networks (NNs). FNs extend the linear connections from the input feature layer to the hidden layer in NNs to factorization connections, which represent the weights of feature co-occurrences using a factorized method. In addition to FNs, we also propose a hybrid model that integrates FNs, NNs and Maximum Entropy (ME) models. The component models in the hybrid model share the same input features. On two data sets (the ATIS data set and the Microsoft Cortana Chinese data set), the proposed models show promising results. In particular, on the large Microsoft Cortana Chinese data set, which is translated from well-annotated English data, FNs using unigram, class and query length features achieve more than 20% relative error reduction over linear Support Vector Machines (SVMs).
ISSN:	1520-6149, 2379-190X
DOI:	10.1109/ICASSP.2015.7178978
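
The factorization connections described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the record gives no equations, so the code assumes the standard second-order Factorization Machine term, in which the weight of a co-occurring feature pair (i, j) is the dot product of two low-rank factor vectors. The names `fm_pairwise`, `fm_pairwise_naive` and `factorization_hidden_layer`, the per-hidden-unit factor tensor `V`, and the tanh activation are all hypothetical choices made for illustration.

```python
import numpy as np


def fm_pairwise(x, V):
    """Sum over i < j of <V[i], V[j]> * x[i] * x[j], computed in O(n * k).

    x: (n,) input feature vector; V: (n, k) factor matrix (assumed layout).
    Uses the standard Factorization Machine identity
    0.5 * sum_f [(sum_i V[i, f] x[i])^2 - sum_i V[i, f]^2 x[i]^2].
    """
    s = V.T @ x                      # (k,) per-factor weighted sums
    s2 = (V ** 2).T @ (x ** 2)       # (k,) per-factor sums of squares
    return 0.5 * float(np.sum(s ** 2 - s2))


def fm_pairwise_naive(x, V):
    """Same quantity as fm_pairwise, written as an explicit double loop."""
    total = 0.0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            total += float(V[i] @ V[j]) * x[i] * x[j]
    return total


def factorization_hidden_layer(x, W, b, V):
    """Hidden layer with linear plus factorized co-occurrence connections.

    W: (h, n) linear weights, b: (h,) biases, V: (h, n, k) one factor
    matrix per hidden unit. This layout and the tanh activation are
    assumptions for this sketch, not details taken from the paper.
    """
    linear = W @ x + b
    pairwise = np.array([fm_pairwise(x, V[u]) for u in range(V.shape[0])])
    return np.tanh(linear + pairwise)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, k, h = 6, 3, 4                # features, factor size, hidden units
    x = rng.random(n)
    W = rng.normal(size=(h, n))
    b = np.zeros(h)
    V = rng.normal(size=(h, n, k))
    print(factorization_hidden_layer(x, W, b, V).shape)
```

Factorizing the pair weights keeps the parameter count linear in the number of input features (n * k per hidden unit here) rather than quadratic, which is the usual motivation for Factorization Machines on sparse features such as unigrams.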