Header-token driven automatic text segmentation

Bibliographic Details
Main Authors: SARWAR BADRUL M, MOUNT JOHN A
Format: Patent
Language: English
Published: 14.01.2014
Summary: A method and a system to automatically segment text based on header tokens are described. A relevance value and an irrelevance value are determined for every token in a description, with no token excluded from the computation. The irrelevance value is based on occurrences of the token in a sample set of descriptions. The relevance value is an estimated probability of relevance based on the header of the description being segmented.
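The two per-token values in the summary can be illustrated with a minimal Python sketch. The summary does not give the formulas, so everything here is an assumption made for illustration: irrelevance is taken as the fraction of sample descriptions that contain the token, and relevance as a smoothed fraction of the sample descriptions whose header shares at least one token with the header of the description being segmented. The names `tokenize` and `token_scores` are hypothetical and do not come from the patent.

```python
def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the patent may differ.
    return text.lower().split()


def token_scores(header, description, samples):
    """Return {token: (relevance, irrelevance)} for every token in description.

    samples is a list of (header, description) pairs standing in for the
    "sample set of descriptions" mentioned in the summary.
    """
    header_tokens = set(tokenize(header))
    sample_sets = [(set(tokenize(h)), set(tokenize(d))) for h, d in samples]
    # Sample descriptions whose header overlaps the current header (assumed
    # proxy for "based on the header of the description being segmented").
    related = [d for h, d in sample_sets if header_tokens & h]

    scores = {}
    for tok in set(tokenize(description)):  # every token gets both values
        # Irrelevance: share of all sample descriptions containing the token.
        hits_all = sum(tok in d for _, d in sample_sets)
        irrelevance = hits_all / len(sample_sets) if sample_sets else 0.0
        # Relevance: add-one smoothed share among header-related samples.
        hits_related = sum(tok in d for d in related)
        relevance = (hits_related + 1) / (len(related) + 2)
        scores[tok] = (relevance, irrelevance)
    return scores
```

Under these assumptions, a token that is common in header-related samples but rare across the whole sample set scores as relevant, while a token that appears everywhere scores as irrelevant. How the two values are then combined to choose segment boundaries is not stated in the summary and is left out of the sketch.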
Bibliography: Application Number: US20060646900