SYSTEM AND METHOD FOR CATEGORIZING DOCUMENTS, AND APPARATUS APPLIED TO THE SAME
Main Authors | , , |
---|---|
Format | Patent |
Language | English, Korean |
Published | 21.04.2014 |
Subjects | |
Summary: | The present invention discloses a document classification system, a corresponding method, and an apparatus applied to the same. The invention rearranges the configuration information defined in a web-based unstructured document to generate a structured document, and determines whether the structured document is related to a category based on its similarity to the reference documents of the categories set for document classification. When the document is determined to be related, one of the categories is assigned to it based on the inclusion state of the keywords set for each category, so that unstructured documents can be classified efficiently and applied to actual services. [Reference numerals] (AA) Start; (BB) End; (S210) Collect an unstructured document; (S220) Generate a structured document; (S230) Is reference document generation required?; (S240) Select and store the reference document through grouping; (S250) Check the reference document for selecting the related document; (S260) Determine the similarity; (S270) At or above the reference value?; (S280) Determine as a related document; (S290) Does a category path exist?; (S300) Allocate a category by using the category path; (S310) Allocate the category by using the weight and frequency of keywords; (S320) Does a residual structured document exist?; (S330) Allocate the category by using the similarity; (S340) Does a residual structured document exist?; (S350) Exclude non-relevant documents |
---|---|
Bibliography: | Application Number: KR20120110625 |
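
The classification flow listed in the reference numerals (S260 to S350) can be sketched roughly as below. This is only an illustration under stated assumptions, not the patented implementation: the record does not name a similarity measure, so cosine similarity over word counts is assumed here, and the function names, the `category_path` field, the keyword weight tables, and the `threshold` value are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Split text into lowercase whitespace-separated tokens (assumed tokenizer)."""
    return text.lower().split()


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(document, references, keywords, threshold=0.2):
    """Return a category name, or None for a non-relevant document.

    document:   dict with 'text' and an optional 'category_path' (hypothetical)
    references: {category: reference document text}
    keywords:   {category: {keyword: weight}} (hypothetical weight table)
    """
    counts = Counter(tokenize(document["text"]))

    # S260: similarity between the structured document and each reference.
    sims = {
        cat: cosine_similarity(counts, Counter(tokenize(ref)))
        for cat, ref in references.items()
    }

    # S270 / S350: below the reference value means the document is excluded.
    if max(sims.values(), default=0.0) < threshold:
        return None

    # S290 / S300: use the category path when the document carries one.
    path = document.get("category_path")
    if path in references:
        return path

    # S310: score each category by keyword weight times keyword frequency.
    scores = {
        cat: sum(weight * counts[kw] for kw, weight in kws.items())
        for cat, kws in keywords.items()
    }
    best = max(scores, key=scores.get, default=None)
    if best is not None and scores[best] > 0:
        return best

    # S330: fall back to the most similar reference document.
    return max(sims, key=sims.get)


references = {
    "sports": "football match goal team league",
    "finance": "stock market bank interest rate",
}
keywords = {
    "sports": {"goal": 2.0, "team": 1.0},
    "finance": {"stock": 2.0, "bank": 1.0},
}
```

With these sample tables, a document containing sports keywords is assigned by keyword score, a document with no matching keywords falls back to the most similar reference, and a document unrelated to every reference is excluded.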