A parallel computing approach to creating engineering concept spaces for semantic retrieval: the Illinois Digital Library Initiative project

Bibliographic Details
Published in IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 18; no. 8; pp. 771-782
Main Authors Hsinchun Chen, Schatz, B., Ng, T., Martinez, J., Kirchhoff, A., Chienting Lin
Format Journal Article
Language English
Published IEEE 01.08.1996
Summary: This research presents preliminary results generated from the semantic retrieval research component of the Illinois Digital Library Initiative (DLI) project. Using a variation of the automatic thesaurus generation techniques, which we refer to as the concept space approach, we aimed to create graphs of domain-specific concepts (terms) and their weighted co-occurrence relationships for all major engineering domains. Merging these concept spaces and providing traversal paths across different concept spaces could potentially help alleviate the vocabulary (difference) problem evident in large-scale information retrieval. To address the scalability issue related to large-scale information retrieval and analysis for the current Illinois DLI project, we conducted experiments using the concept space approach on parallel supercomputers. Our test collection included computer science and electrical engineering abstracts extracted from the INSPEC database. The concept space approach called for extensive textual and statistical analysis (a form of knowledge discovery) based on automatic indexing and co-occurrence analysis algorithms, both previously tested in the biology domain. Initial testing results using a 512-node CM-5 and a 16-processor SGI Power Challenge were promising.
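The summary describes a concept space as a graph of terms linked by weighted co-occurrence relationships. The following is a minimal sketch of that general idea, not the authors' algorithm: the function name `build_concept_space`, the sample terms, and the weighting (the fraction of documents containing term A that also contain term B) are all assumptions chosen for illustration, and the automatic indexing step is assumed to have already produced a term list per abstract.

```python
from collections import Counter
from itertools import combinations


def build_concept_space(docs):
    """Build a directed, weighted co-occurrence graph.

    docs: list of term lists, one per abstract (hypothetical input format).
    Returns {term: {other_term: weight}}, where the illustrative weight is
    (documents containing both terms) / (documents containing `term`).
    """
    doc_freq = Counter()   # number of documents each term appears in
    pair_freq = Counter()  # number of documents each term pair shares
    for terms in docs:
        unique = sorted(set(terms))  # count each term once per document
        doc_freq.update(unique)
        pair_freq.update(combinations(unique, 2))

    graph = {term: {} for term in doc_freq}
    for (a, b), shared in pair_freq.items():
        # The weight is asymmetric: it depends on the source term's frequency.
        graph[a][b] = shared / doc_freq[a]
        graph[b][a] = shared / doc_freq[b]
    return graph


# Hypothetical sample data standing in for indexed abstracts.
docs = [
    ["parallel computing", "information retrieval"],
    ["parallel computing", "supercomputer"],
    ["information retrieval", "thesaurus"],
]
space = build_concept_space(docs)
```

Because each document contributes to the counters independently, per-document counting of this kind can in principle be split across processors and the counters merged afterward, which is one reason such analysis lends itself to parallel machines.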
ISSN:0162-8828
DOI:10.1109/34.531798