DOCUMENT PROCESSING PROGRAM AND DOCUMENT PROCESSOR
Main Authors | , , |
---|---|
Format | Patent |
Language | English |
Published | 22.07.2010 |
Summary: | PROBLEM TO BE SOLVED: To improve the precision of document classification. SOLUTION: For every document stored in a document storage part 22, a keyword extraction part 311 extracts a keyword based on the appearance frequency of character strings in that document. An object sentence extraction part 312 extracts, from the document from which the keyword was extracted, a summary sentence that includes the keyword. A paraphrastic sentence generation part 322 generates a paraphrastic sentence including the keyword, based on the keyword included in the extracted summary sentence and on the result of a modification (dependency) analysis of the summary sentence. An identity extraction part 42 extracts a set of identities including the keyword contained in the paraphrastic sentence generated by the paraphrastic sentence generation part 322, and stores the set of identities in an identity storage part 26. A document vector generation part 442 generates a document vector for each document stored in the document storage part 22; each component value of the vector indicates the appearance frequency, in the summary sentences extracted from that document, of one of the identities stored in the identity storage part 26. COPYRIGHT: (C)2010,JPO&INPIT |
---|---|
Bibliography: | Application Number: JP20090001851 |
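
The flow described in the summary (frequency-based keyword extraction, selection of summary sentences that contain a keyword, and a document vector of appearance frequencies) can be illustrated with a minimal Python sketch. This is not the patented implementation. The function names, the regular-expression tokenizer, and the stopword handling are all hypothetical. The paraphrastic sentence generation step is omitted because it relies on a modification analysis that the record does not detail, so the sketch assumes the set of identities is simply a list of words supplied by the caller.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a stand-in for real morphological analysis."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(text):
    """Naive split on whitespace that follows sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_keywords(doc, top_n=1, stopwords=frozenset()):
    """Pick the top_n most frequent non-stopword tokens as keywords."""
    counts = Counter(t for t in tokenize(doc) if t not in stopwords)
    return [word for word, _ in counts.most_common(top_n)]


def extract_summary_sentences(doc, keywords):
    """Keep only the sentences that contain at least one keyword."""
    wanted = set(keywords)
    return [s for s in split_sentences(doc) if wanted & set(tokenize(s))]


def document_vector(summary_sentences, identities):
    """Count how often each identity appears in the summary sentences."""
    counts = Counter(t for s in summary_sentences for t in tokenize(s))
    return [counts[i] for i in identities]


# Usage with made-up data (assumption: identities are chosen by hand here).
doc = "The engine drives the pump. The pump moves water. Nothing else happens here."
keywords = extract_keywords(doc, top_n=1, stopwords={"the"})
summary = extract_summary_sentences(doc, keywords)
print(keywords)                                              # ['pump']
print(document_vector(summary, ["pump", "water", "here"]))  # [2, 1, 0]
```

In this sketch the third sentence is dropped because it contains no keyword, so "here" contributes nothing to the vector; counting over summary sentences only, rather than the whole document, is the point the abstract emphasizes.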