DOCUMENT CLASSIFYING DEVICE, METHOD AND PROGRAM

Bibliographic Details
Main Authors: INABA MASUMI, MANABE TOSHIHIKO, NAKANO WATARU, KOKUBU TOMOHARU
Format: Patent
Language: English
Published: 11.04.2013
Summary: PROBLEM TO BE SOLVED: To provide a document classifying device capable of presenting a classification result of documents in an intelligible manner, and further to provide a corresponding method and program. SOLUTION: A document classifying device of one embodiment extracts feature words from the documents contained in a document set and clusters the extracted feature words into a plurality of clusters. Each cluster constitutes a sub-tree of a thesaurus having a tree structure, and the clusters are formed so that the difference between the number of documents in which the feature words belonging to one cluster appear and the number of documents in which the feature words belonging to any other cluster appear is not greater than a reference value set beforehand. The device then classifies each document in the document set into the clusters to which the feature words appearing in that document belong, and imparts to each cluster a classification label, which is a word or phrase representing the feature words belonging to that cluster. Finally, the device presents the classification result of the documents in association with the classification labels imparted to the clusters.
Bibliography: Application Number: JP20110202281
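
The procedure described in the summary can be illustrated with a small Python sketch. It is only one possible reading of the abstract, not the patented implementation. The toy thesaurus, the sample documents, every function name, and the "split the largest sub-tree" balancing heuristic are all assumptions made for illustration. The abstract does not say how feature words are extracted or how the balanced clusters are found. Here a feature word is simply any thesaurus word that occurs in a document, and the label of a cluster is the name of its sub-tree root.

```python
from typing import Dict, List, Set

Thesaurus = Dict[str, List[str]]  # parent -> children (tree structure)
Docs = Dict[str, Set[str]]        # document id -> set of tokens

# Hypothetical toy data: internal nodes are category names, leaves are words.
THESAURUS: Thesaurus = {
    "thing": ["animal", "vehicle"],
    "animal": ["dog", "cat", "bird"],
    "vehicle": ["car", "bus"],
}
DOCS: Docs = {
    "d1": {"my", "dog", "chased", "a", "cat"},
    "d2": {"dog", "training"},
    "d3": {"cat", "and", "bird"},
    "d4": {"bird", "song"},
    "d5": {"car", "repair"},
}


def subtree_words(node: str, thesaurus: Thesaurus) -> Set[str]:
    """All words in the sub-tree rooted at `node`, including `node` itself."""
    words = {node}
    for child in thesaurus.get(node, []):
        words |= subtree_words(child, thesaurus)
    return words


def docs_for(node: str, docs: Docs, thesaurus: Thesaurus) -> List[str]:
    """Sorted ids of documents containing at least one word of the sub-tree."""
    words = subtree_words(node, thesaurus)
    return sorted(d for d, tokens in docs.items() if words & tokens)


def cluster_subtrees(root: str, docs: Docs, thesaurus: Thesaurus,
                     reference_value: int) -> List[str]:
    """Assumed heuristic: start from the root's child sub-trees and keep
    splitting the sub-tree with the most documents until the gap between
    the largest and smallest document counts is within `reference_value`."""
    clusters = [c for c in thesaurus.get(root, []) if docs_for(c, docs, thesaurus)]
    while clusters:
        counts = {c: len(docs_for(c, docs, thesaurus)) for c in clusters}
        largest = max(clusters, key=lambda c: counts[c])
        if counts[largest] - min(counts.values()) <= reference_value:
            break  # document counts are balanced enough
        children = [c for c in thesaurus.get(largest, [])
                    if docs_for(c, docs, thesaurus)]
        if not children:
            break  # the largest cluster is a leaf and cannot be split
        clusters.remove(largest)
        clusters.extend(children)
    return clusters


def classify(docs: Docs, thesaurus: Thesaurus, root: str,
             reference_value: int) -> Dict[str, List[str]]:
    """Map each cluster label (the sub-tree root name) to its documents."""
    clusters = cluster_subtrees(root, docs, thesaurus, reference_value)
    return {label: docs_for(label, docs, thesaurus) for label in clusters}


if __name__ == "__main__":
    for label, ids in classify(DOCS, THESAURUS, "thing", 1).items():
        print(label, ids)
```

With a reference value of 1, the "animal" sub-tree (four documents) is split into "dog", "cat" and "bird" so that its document count is closer to that of "vehicle" (one document). With a reference value of 3, the two top-level sub-trees are kept as they are. A document can fall into more than one cluster when it contains feature words from several sub-trees, as d1 and d3 do here.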