APPARATUS AND METHOD FOR RETRIEVING STRUCTURED DOCUMENTS

Bibliographic Details
Main Authors: MANABE TOSHIHIKO, KOKUBU TOMOHARU
Format: Patent
Language: English
Published: 28.05.2009
Summary: An apparatus for retrieving structured documents includes a first categorizing unit configured to categorize components into a first component of typical descriptions and a second component of atypical descriptions, based on statistics information for the components; a second categorizing unit configured to categorize the terms into a first term whose appearance ratio in the first component exceeds a threshold and a second term whose appearance ratio in the first component is not more than the threshold; an extraction unit configured to extract, from the structured documents, a set of structured documents each having the first component including the first term and the second component; and a ranking unit configured to rank the set of structured documents by a retrieval score calculated based on a relation between the second term and the second component.
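
The four units named in the summary can be read as a small pipeline. The following is only a rough sketch of that reading, not the patented implementation: the summary does not say which statistics, threshold values, or scoring relation are used, so everything concrete here is an assumption. In particular, the "share of distinct values" statistic, both 0.5 thresholds, the whitespace tokenizer, the term-frequency score, and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter


def tokenize(text):
    """Assumed tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def categorize_components(docs, distinct_ratio_threshold=0.5):
    """First categorizing unit (sketch).

    Assumed statistic: the share of distinct values a component takes across
    the collection. Components whose values repeat often are treated as
    'typical' descriptions, the rest as 'atypical'.
    """
    values = {}
    for doc in docs:
        for name, text in doc.items():
            values.setdefault(name, []).append(text)
    typical, atypical = set(), set()
    for name, texts in values.items():
        ratio = len(set(texts)) / len(texts)
        (typical if ratio <= distinct_ratio_threshold else atypical).add(name)
    return typical, atypical


def categorize_terms(terms, docs, typical, threshold=0.5):
    """Second categorizing unit (sketch).

    A term is a 'first term' if its appearance ratio in typical components
    exceeds the threshold, otherwise a 'second term'.
    """
    in_typical, total = Counter(), Counter()
    for doc in docs:
        for name, text in doc.items():
            for tok in tokenize(text):
                total[tok] += 1
                if name in typical:
                    in_typical[tok] += 1
    first, second = [], []
    for term in terms:
        ratio = in_typical[term] / total[term] if total[term] else 0.0
        (first if ratio > threshold else second).append(term)
    return first, second


def retrieve(query, docs):
    """Extraction and ranking units (sketch). Returns document indices."""
    typical, atypical = categorize_components(docs)
    first, second = categorize_terms(tokenize(query), docs, typical)
    scored = []
    for index, doc in enumerate(docs):
        typical_tokens = {t for n in typical if n in doc for t in tokenize(doc[n])}
        atypical_tokens = [t for n in atypical if n in doc for t in tokenize(doc[n])]
        # Extraction: keep documents that have an atypical component and whose
        # typical components contain every first term.
        if not atypical_tokens or not all(t in typical_tokens for t in first):
            continue
        # Ranking: assumed score is the frequency of second terms inside the
        # atypical components.
        counts = Counter(atypical_tokens)
        scored.append((sum(counts[t] for t in second), index))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for _, index in scored]


if __name__ == "__main__":
    docs = [
        {"category": "printer", "report": "paper jam near tray"},
        {"category": "printer", "report": "toner low warning"},
        {"category": "scanner", "report": "paper jam jam again"},
        {"category": "printer", "report": "jam jam jam in feeder"},
    ]
    print(retrieve("printer jam", docs))  # prints [3, 0, 1]
```

In this toy collection, "category" repeats and is treated as the typical component, while "report" is free text and is treated as atypical. The query term "printer" therefore acts as a filter, and "jam" only affects the ordering of the documents that pass the filter.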
Bibliography: Application Number: US20080205636