APPARATUS AND METHOD FOR RETRIEVING STRUCTURED DOCUMENTS
Main Authors | , |
---|---|
Format | Patent |
Language | English |
Published | 28.05.2009 |
Subjects | |
Summary: | An apparatus for retrieving structured documents includes a first categorizing unit configured to categorize components into a first component of typical descriptions and a second component of atypical descriptions, based on statistics information for the components; a second categorizing unit configured to categorize the terms into a first term whose appearance ratio in the first component exceeds a threshold and a second term whose appearance ratio in the first component is not more than the threshold; an extraction unit configured to extract, from the structured documents, a set of structured documents each having the second component and the first component including the first term; and a ranking unit configured to rank the set of structured documents by a retrieval score calculated based on a relation between the second term and the second component. |
---|---|
Bibliography: | Application Number: US20080205636 |
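
The four units named in the summary can be pictured as a small pipeline. The Python sketch below is illustrative only and is not taken from the patent: the record does not say which statistics separate typical from atypical components, what the threshold is, or how the retrieval score is computed. The sketch therefore assumes a type-token ratio as the component statistic, a threshold of 0.5, and a score equal to the number of second-term occurrences in atypical components. All function names, parameters, and sample data are hypothetical.

```python
from collections import Counter

# A document is modelled as {component_name: [term, ...]} (an assumption).

def categorize_components(docs, max_type_token_ratio=0.5):
    """First categorizing unit (assumed statistic: type-token ratio).

    Components with a repetitive vocabulary are treated as typical,
    all other components as atypical.
    """
    tokens = {}
    for doc in docs:
        for name, terms in doc.items():
            tokens.setdefault(name, []).extend(terms)
    typical = {
        name for name, terms in tokens.items()
        if terms and len(set(terms)) / len(terms) <= max_type_token_ratio
    }
    return typical, set(tokens) - typical

def categorize_terms(docs, query_terms, typical, threshold=0.5):
    """Second categorizing unit: split query terms by their appearance
    ratio in typical components (occurrences there / all occurrences)."""
    in_typical, total = Counter(), Counter()
    for doc in docs:
        for name, terms in doc.items():
            for term in terms:
                if term in query_terms:
                    total[term] += 1
                    if name in typical:
                        in_typical[term] += 1
    first = {
        t for t in query_terms
        if total[t] and in_typical[t] / total[t] > threshold
    }
    return first, set(query_terms) - first

def retrieve(docs, query_terms, max_type_token_ratio=0.5, threshold=0.5):
    """Extraction and ranking units: return document indices, best first."""
    typical, atypical = categorize_components(docs, max_type_token_ratio)
    first, second = categorize_terms(docs, query_terms, typical, threshold)
    scored = []
    for index, doc in enumerate(docs):
        typical_terms = {t for n, ts in doc.items() if n in typical for t in ts}
        atypical_terms = [t for n, ts in doc.items() if n in atypical for t in ts]
        # Keep documents whose typical components contain every first term
        # and which have a non-empty atypical component.
        if not first <= typical_terms or not atypical_terms:
            continue
        # Assumed score: occurrences of second terms in atypical components.
        score = sum(1 for t in atypical_terms if t in second)
        scored.append((score, index))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for _, index in scored]

# Hypothetical sample data: "category" is repetitive, "body" is free text.
SAMPLE_DOCS = [
    {"category": ["retrieval"], "body": ["xml", "index", "tree"]},
    {"category": ["retrieval"], "body": ["xml", "xml", "query"]},
    {"category": ["storage"], "body": ["xml", "disk", "cache"]},
    {"category": ["retrieval"], "body": ["ranking", "score", "model"]},
]

ranking = retrieve(SAMPLE_DOCS, {"retrieval", "xml"})  # [1, 0, 3]
```

With this sample data, "retrieval" appears only in the typical `category` component and so acts as a filter, while "xml" appears only in the atypical `body` component and drives the ranking. The document about storage is excluded, and the remaining documents are ordered by how often "xml" occurs in their bodies.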