Agricultural Ontology Based Feature Optimization for Agricultural Text Clustering


Bibliographic Details
Published in Journal of Integrative Agriculture Vol. 11; no. 5; pp. 752-759
Main Authors SU, Ya-ru, WANG, Ru-jing, CHEN, Peng, WEI, Yuan-yuan, LI, Chuan-xi, HU, Yi-min
Format Journal Article
Language English
Published Elsevier B.V 01.05.2012
Science Press
School of Information Science and Technology, University of Science and Technology of China, Hefei 230026, P.R.China
Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei 230031, P.R.China
Elsevier
Summary:Feature optimization is important to agricultural text mining. Text documents are usually represented with the vector space model. However, this basic approach suffers from two drawbacks: the curse of dimensionality and the lack of semantic information. This paper proposes a novel ontology-based feature optimization method for agricultural text. First, terms of the vector space model are mapped to concepts of an agricultural ontology, and concept frequency weights are computed statistically from the term frequency weights; second, concept similarity weights are assigned to the concept features according to the structure of the agricultural ontology. Combining the feature frequency weights and the feature similarity weights based on the agricultural ontology drastically reduces the dimensionality of the feature space and incorporates semantic information into the representation. The results show that this feature optimization significantly improves agricultural text clustering.
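The two steps in the summary can be illustrated with a small Python sketch. The abstract does not give the paper's formulas, so everything here is an assumption made for illustration: the toy ontology `PARENT`, the term-to-concept table `TERM_TO_CONCEPT`, the path-based similarity `1 / (1 + path length)`, and the rule that combines the two weights by summing similarity-scaled concept frequencies. It is not the authors' implementation.

```python
from collections import Counter

# Hypothetical toy ontology: concept -> parent concept (None marks the root).
PARENT = {
    "agriculture": None,
    "crop": "agriculture",
    "cereal": "crop",
    "rice": "cereal",
    "wheat": "cereal",
    "pest": "agriculture",
}

# Hypothetical mapping from vector-space terms to ontology concepts.
TERM_TO_CONCEPT = {
    "rice": "rice",
    "paddy": "rice",
    "wheat": "wheat",
    "grain": "cereal",
    "aphid": "pest",
}


def ancestors(concept):
    """Return the chain [concept, parent, ..., root]."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def path_length(a, b):
    """Number of edges between two concepts via their lowest common ancestor."""
    chain_a, chain_b = ancestors(a), ancestors(b)
    for steps_a, node in enumerate(chain_a):
        if node in chain_b:
            return steps_a + chain_b.index(node)
    raise ValueError("concepts do not share a root")


def similarity(a, b):
    """Assumed structural similarity: closer concepts score nearer to 1."""
    return 1.0 / (1.0 + path_length(a, b))


def concept_frequencies(tokens):
    """Step 1: map terms to concepts and sum term frequencies per concept."""
    return Counter(TERM_TO_CONCEPT[t] for t in tokens if t in TERM_TO_CONCEPT)


def optimized_weights(tokens):
    """Step 2 (assumed rule): scale each concept frequency by similarity and sum."""
    freq = concept_frequencies(tokens)
    return {
        c: sum(similarity(c, other) * f for other, f in freq.items())
        for c in freq
    }


tokens = ["rice", "paddy", "wheat", "soil", "aphid"]
print(concept_frequencies(tokens))
print(optimized_weights(tokens))
```

In this toy document, five tokens collapse to three concept features, which is the dimensionality reduction the summary describes, and the similarity term lets related concepts such as "rice" and "wheat" reinforce each other.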
Bibliography:10-1039/S
Keywords:agricultural ontology, feature optimization, agricultural text clustering
http://www.chinaagrisci.com/Jwk_zgnykxen/fileup/PDF/2012.V11(05)-752.pdf
ISSN:2095-3119
2352-3425
DOI:10.1016/S2095-3119(12)60064-1