English corpus and literary analysis based on statistical language model

Bibliographic Details
Published in	Cluster Computing, Vol. 22, no. Suppl 6, pp. 14897-14903
Main Authors	Huang, Bo; Lan, Xijun
Format	Journal Article
Language	English
Published	New York: Springer US, 01.11.2019; Springer Nature B.V.
Subjects	Algorithms; Bilingual dictionaries; Bilingualism; Boolean; Clustering; Computer Communication Networks; Computer Science; Corpus analysis; Documents; Fuzzy sets; Information retrieval; Internet; Language modeling; Languages; Literary criticism; Literary translation; Machine translation; Native languages; Operating Systems; Parallel corpora; Probability; Processor Architectures; Queries; Semantics; Translation methods and strategies; Vector space; Words (language); Statistical language model; English corpus
Summary:	This paper systematically studies a cross-language information retrieval (CLIR) model based on a statistical language model, together with cross-lingual text categorization and cross-lingual text clustering methods. Without relying on cross-lingual resources such as machine translation or bilingual dictionaries, the approach addresses the many-to-many problem of word translation in CLIR and partially addresses the problem of unregistered (out-of-vocabulary) words. Under a unified framework, a series of topics is extracted from bilingual parallel corpora to form a thematic space for each language. The thematic space of each language exists independently, and a bilingual subject space is established through bilingual semantic correspondence. The bilingual subject space reflects the semantic correspondence between documents and between words, and it reveals the inherent structure of, and the internal relations between, the languages.
ISSN:	1386-7857; 1573-7543
DOI:	10.1007/s10586-018-2454-y
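
Illustrative note: the Summary describes retrieval across languages through a shared space learned from parallel corpora, without a dictionary or machine translation. The sketch below is a deliberately simplified stand-in for that idea and is not the authors' model: it treats each aligned document pair as one "topic", describes every word by its counts over those pairs, and compares an English query with French documents by cosine similarity in that shared space. The toy corpus and the names `PARALLEL`, `word_profiles`, `embed`, `cosine`, and `rank` are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt

# Toy document-aligned parallel corpus (hypothetical data, illustration only).
PARALLEL = [
    ("the cat sleeps on the mat", "le chat dort sur le tapis"),
    ("the bank approves the loan", "la banque approuve le pret"),
    ("the river bank floods in spring", "la rive du fleuve deborde au printemps"),
]


def word_profiles(side):
    """Map each word to its counts over the aligned pairs (its 'topic' profile)."""
    profiles = {}
    for idx, pair in enumerate(PARALLEL):
        for word in pair[side].split():
            profiles.setdefault(word, Counter())[idx] += 1
    return profiles


EN = word_profiles(0)  # English side of the corpus
FR = word_profiles(1)  # French side of the corpus


def embed(text, profiles):
    """Project a text into the shared space by summing its words' profiles."""
    vec = Counter()
    for word in text.split():
        # Words never seen in the parallel corpus contribute nothing.
        vec.update(profiles.get(word, {}))
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def rank(query_en, docs_fr):
    """Order French documents by similarity to an English query."""
    q = embed(query_en, EN)
    return sorted(docs_fr, key=lambda d: cosine(q, embed(d, FR)), reverse=True)
```

Because "bank" occurs in two aligned pairs, its profile spreads over both, which is one simple way a shared space can represent a many-to-many word correspondence without a dictionary. A query word absent from the parallel corpus is simply ignored here, so this sketch does not handle unregistered words.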