DOCUMENT PROCESSING PROGRAM AND DOCUMENT PROCESSOR

Bibliographic Details
Main Authors SAITO YOSHIMI, KANO TOSHIYUKI, KURATA SAORI
Format Patent
Language English
Published 22.07.2010
Summary: PROBLEM TO BE SOLVED: To improve the precision of document classification. SOLUTION: For every document stored in a document storage part 22, a keyword extraction part 311 extracts a keyword based on the appearance frequency of character strings in the document. An object sentence extraction part 312 extracts, from each document from which a keyword has been extracted, a summary sentence that includes the extracted keyword. A paraphrastic sentence generation part 322 generates a paraphrastic sentence including the keyword, based on the keyword included in the extracted summary sentence and the result of the modification (dependency) analysis of that summary sentence. An identity extraction part 42 extracts a set of identities including the keyword contained in the paraphrastic sentence generated by the paraphrastic sentence generation part 322, and stores the set of identities in an identity storage part 26. A document vector generation part 442 generates a document vector for each document stored in the document storage part 22, using document vector component values that show the appearance frequency, in the summary sentence extracted from that document, of the sets of identities stored in the identity storage part 26. COPYRIGHT: (C)2010,JPO&INPIT
Bibliography: Application Number: JP20090001851
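
The processing flow described in the summary (keyword extraction by frequency, summary sentence extraction, identity extraction, document vector generation) can be illustrated with a minimal Python sketch. This is not the patented implementation. All function names, the stopword list, and the regex tokenizer are assumptions introduced for illustration. In particular, the paraphrastic sentence generation step, which the summary says relies on modification analysis, is replaced here by a much simpler stand-in: adjacent word pairs that contain a keyword are treated as the "sets of identities".

```python
import re
from collections import Counter

# Hypothetical stopword list, used only for this illustration.
STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "for", "on"}


def split_sentences(text):
    """Split text into sentences at ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_tokens(text):
    """Lowercase alphanumeric tokens with stopwords removed."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def extract_keywords(document, top_n=1):
    """Keyword extraction: the most frequent tokens in the document."""
    counts = Counter(content_tokens(document))
    return [word for word, _ in counts.most_common(top_n)]


def extract_summary_sentences(document, keywords):
    """Summary sentence extraction: sentences that contain a keyword."""
    return [
        s for s in split_sentences(document)
        if any(k in content_tokens(s) for k in keywords)
    ]


def extract_identities(sentences, keywords):
    """Stand-in for paraphrase generation plus identity extraction.

    ASSUMPTION: adjacent token pairs containing a keyword are used in
    place of paraphrases derived from modification analysis.
    """
    identities = set()
    for s in sentences:
        tokens = content_tokens(s)
        for left, right in zip(tokens, tokens[1:]):
            if left in keywords or right in keywords:
                identities.add((left, right))
    return identities


def document_vector(sentences, identity_list):
    """Count how often each stored identity appears in the summary sentences."""
    counts = Counter()
    for s in sentences:
        tokens = content_tokens(s)
        counts.update(zip(tokens, tokens[1:]))
    return [counts[identity] for identity in identity_list]


def build_vectors(documents, top_n=1):
    """Run the whole flow over a collection of documents."""
    identity_store = set()   # plays the role of the identity storage part
    summaries = []
    for doc in documents:
        keywords = extract_keywords(doc, top_n)
        sentences = extract_summary_sentences(doc, keywords)
        identity_store |= extract_identities(sentences, keywords)
        summaries.append(sentences)
    identity_list = sorted(identity_store)
    vectors = [document_vector(s, identity_list) for s in summaries]
    return identity_list, vectors


if __name__ == "__main__":
    docs = [
        "The printer jams paper. The printer needs toner.",
        "Toner is low. Replace the toner cartridge.",
    ]
    identities, vectors = build_vectors(docs)
    for identity, column in zip(identities, zip(*vectors)):
        print(identity, column)
```

The resulting vectors share one set of components across all documents, so they could be passed to any standard classifier or similarity measure. Which classifier the patent uses is not stated in the summary.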