Using DMoz for constructing ontology from data stream


Bibliographic Details
Published in	28th International Conference on Information Technology Interfaces, 2006, pp. 439-444
Main Authors	Grobelnik, M., Brank, J., Mladenic, D., Novak, B., Fortuna, B.
Format	Conference Proceeding
Language	English
Published	IEEE 2006
Subjects	Algorithm design and analysis; Companies; Data mining; Data processing; Hardware; Ontologies; Sensor systems

Summary:	This paper presents an approach for constructing an ontology from a stream of documents. Named entities extracted from the documents are used as instances of the ontology. Entities and co-occurring entity pairs are represented by feature vectors based on the content of the documents where they occurred. In general, concepts and relations can be formed into an ontological structure either by clustering or by classification into an existing topic hierarchy. We propose the latter using DMoz as an existing topic hierarchy. The approach is efficient and can scale to large data sets. We propose a framework that incorporates the stream mining process into a formal definition of the ontology. We describe a software component implementing this approach, and present experiments using a large collection of news
ISBN:	9789537138059, 9537138054
ISSN:	1330-1012
DOI:	10.1109/ITI.2006.1708521
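
Illustration:	The summary describes representing entities and co-occurring entity pairs by feature vectors built from the documents in which they occur, then classifying them into an existing topic hierarchy. The sketch below is one possible reading of that idea, not the authors' implementation. This record does not state which features or classifier the paper uses, so plain word counts and cosine similarity stand in for them. The documents, the entity names, and the two category paths written in the style of DMoz are all made up for illustration.

```python
from collections import Counter
from itertools import combinations
import math


def tokenize(text):
    # Lowercase whitespace tokenization; keeps purely alphabetic tokens only.
    return [w for w in text.lower().split() if w.isalpha()]


def build_vectors(docs):
    """Aggregate word counts per entity and per co-occurring entity pair.

    docs is a list of (text, entities) tuples. Single entities are keyed by
    name, pairs by a sorted 2-tuple of names.
    """
    vectors = {}
    for text, entities in docs:
        words = Counter(tokenize(text))
        names = sorted(set(entities))
        keys = names + list(combinations(names, 2))
        for key in keys:
            vectors.setdefault(key, Counter()).update(words)
    return vectors


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(vector, topics):
    # Assign the vector to the most similar topic (assumed stand-in classifier).
    return max(topics, key=lambda name: cosine(vector, topics[name]))


# Hypothetical toy data: not taken from the paper or from DMoz itself.
DOCS = [
    ("Acme shares rose after the company reported strong quarterly earnings",
     ["Acme"]),
    ("The striker scored twice as United won the league match",
     ["United"]),
    ("Acme signed a sponsorship deal with United before the league match",
     ["Acme", "United"]),
]
TOPICS = {
    "Top/Business": Counter("shares company earnings market profit".split()),
    "Top/Sports": Counter("match league scored team goal".split()),
}

if __name__ == "__main__":
    vectors = build_vectors(DOCS)
    for key, vector in vectors.items():
        print(key, "->", classify(vector, TOPICS))
```

In this reading, single entities play the role of ontology instances and entity pairs play the role of candidate relations. Both are placed in the hierarchy by the same classification step.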