Combination of Key Information Extracting with Spoken Document Classification Based on Lattice

Bibliographic Details
Published in Computer Science for Environmental Engineering and EcoInformatics, pp. 236-241
Main Authors Zhang, Lei; Zhang, Zhuo; Xiang, Xue-zhi
Format Book Chapter
Language English
Published Berlin, Heidelberg: Springer Berlin Heidelberg
Series Communications in Computer and Information Science
Summary: Traditionally, the query words in spoken document classification are generated manually. Here, key information extraction based on CHI, TFIDF and maximum posterior probability (MPP) features is combined with a spoken document classification system in which each class has a different topic. The extraction can assign the same key word a distinct weight in each topic. These weights, which reveal the relationship between the word and the topic, can be used in the spoken document classification system. Additionally, the classification system uses document length information when no query word is found. The whole classification system is based on lattices, which carry more information than the 1-best result of a speech recognition system. Among CHI, TFIDF and MPP, MPP performs slightly worse than the others, and CHI is slightly better than TFIDF as the number of key words increases. Experiments show that when the system combines weight and document length information, the best performance reaches 0.769 MAP.
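As a rough illustration of how one key word can receive a different weight in each topic, the following minimal sketch scores words with the standard chi-square statistic over a 2x2 word/topic contingency table. It is not the chapter's implementation: the chapter works on recognition lattices, whereas this sketch assumes plain sets of words per document, and the function names and toy documents are hypothetical.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 word/topic contingency table.

    a: documents in the topic that contain the word
    b: documents outside the topic that contain the word
    c: documents in the topic that lack the word
    d: documents outside the topic that lack the word
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: the word or topic never varies
    return n * (a * d - c * b) ** 2 / denom


def keyword_weights(docs):
    """Return {topic: {word: chi-square weight}}.

    docs is a list of (topic, set_of_words) pairs (an assumed toy format).
    """
    topics = sorted({topic for topic, _ in docs})
    vocab = sorted(set().union(*(words for _, words in docs)))
    weights = {}
    for topic in topics:
        weights[topic] = {}
        for word in vocab:
            a = sum(1 for t, w in docs if t == topic and word in w)
            b = sum(1 for t, w in docs if t != topic and word in w)
            c = sum(1 for t, w in docs if t == topic and word not in w)
            d = sum(1 for t, w in docs if t != topic and word not in w)
            weights[topic][word] = chi_square(a, b, c, d)
    return weights


# Hypothetical toy corpus: two documents per topic.
toy_docs = [
    ("sports", {"match", "goal", "team"}),
    ("sports", {"team", "coach"}),
    ("finance", {"market", "stock", "team"}),
    ("finance", {"stock", "bank"}),
    ("weather", {"rain", "wind"}),
    ("weather", {"rain", "storm"}),
]
weights = keyword_weights(toy_docs)
print(weights["sports"]["team"], weights["finance"]["team"])
```

In this toy corpus the word "team" gets a higher weight for the "sports" topic than for "finance", which is the kind of per-topic difference the summary describes. Note that chi-square measures the strength of association in either direction, so a word that is conspicuously absent from a topic can also score highly.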
ISBN: 9783642226908, 3642226906
ISSN: 1865-0929, 1865-0937
DOI: 10.1007/978-3-642-22691-5_41