Semantic classification method for network Tibetan corpus

Bibliographic Details
Published in	Cluster computing Vol. 20; no. 1; pp. 155 - 165
Main Authors	Xu, Gui-Xian, Wang, Chang-Zhi, Wang, Li-Hui, Zhou, Yu-Hong, Li, Wei-Kang, Xu, Hao, Huang, Qing
Format	Journal Article
Language	English
Published	New York: Springer US, 01.03.2017; Springer Nature B.V.
Subjects	Accuracy; Algorithms; Classification; Computer Communication Networks; Computer Science; Data collection; Data processing; Information processing; Information resources; Information retrieval; Information technology; Internet resources; Methods; Ontology; Operating Systems; Processor Architectures; Semantics; Text categorization; Websites; Concept similarity; Semantic ontology; Tibetan information processing; Semantic classification
Summary:	The number of Tibetan web pages is growing rapidly, and information processing technology can help extract useful knowledge from them. A Tibetan semantic ontology enriches Tibetan digital resources and helps improve information processing performance. This paper studies the semantic classification of a Tibetan network corpus. First, Tibetan web pages are collected. Second, preprocessing extracts the useful information from the web pages. Third, word segmentation and text representation are introduced. Finally, a text similarity classification algorithm is proposed to classify the texts. The experiments compare semantic classification with non-semantic classification. The results show that semantic classification clearly outperforms non-semantic classification, which indicates that making full use of ontology semantic relationships can greatly improve classification accuracy. The research is useful for the study of Tibetan semantic information processing.
ISSN:	1386-7857 1573-7543
DOI:	10.1007/s10586-017-0742-6
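
Illustrative sketch (not from the paper): the summary names a text similarity classification step and reports that using ontology semantic relationships improves accuracy, but this record does not describe the algorithm itself. The Python below is only a minimal sketch of the general idea, assuming cosine similarity over bag-of-words vectors and a toy word-to-concept mapping. The names `TOY_ONTOLOGY`, `to_vector`, `cosine`, `classify`, and `CLASS_EXAMPLES` are hypothetical, and English placeholder tokens stand in for already-segmented Tibetan words.

```python
import math
from collections import Counter

# Hypothetical toy ontology: maps a surface word to a shared concept label.
# The record does not describe the paper's ontology; this is illustration only.
TOY_ONTOLOGY = {
    "doctor": "medicine",
    "hospital": "medicine",
    "clinic": "medicine",
    "teacher": "education",
    "school": "education",
    "pupil": "education",
}


def to_vector(tokens, ontology=None):
    """Build a sparse count vector; with an ontology, words map to concepts."""
    if ontology is None:
        return Counter(tokens)
    return Counter(ontology.get(tok, tok) for tok in tokens)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(tokens, class_examples, ontology=None):
    """Return the most similar class label, or None if nothing overlaps."""
    doc = to_vector(tokens, ontology)
    best_label, best_score = None, 0.0
    for label, example in class_examples.items():
        score = cosine(doc, to_vector(example, ontology))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Assumed example data: one short token list per class.
CLASS_EXAMPLES = {
    "health": ["doctor", "hospital"],
    "schooling": ["teacher", "school"],
}

segmented_doc = ["clinic", "visit"]
print(classify(segmented_doc, CLASS_EXAMPLES))                # None
print(classify(segmented_doc, CLASS_EXAMPLES, TOY_ONTOLOGY))  # health
```

In this toy setup the document shares no surface word with either class, so the non-semantic comparison finds no match, while mapping words to concepts lets "clinic" match the "health" examples. This mirrors the kind of contrast the summary reports, without reproducing the paper's actual method or results.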