Header-token driven automatic text segmentation
Main Authors | , |
---|---|
Format | Patent |
Language | English |
Published | 14.01.2014 |
Subjects | |
Summary: | A method and a system to automatically segment text based on header tokens are described. A relevance value and an irrelevance value are determined for each token in a description. The irrelevance value is based on occurrences of a token in a sample set of descriptions. The relevance value is an estimated probability of relevance based on the header of the description being segmented. |
---|---|
Bibliography: | Application Number: US20060646900 |
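
The summary names two per-token scores but this record gives no formulas, so the following is only an illustrative sketch and not the patented method. It assumes that irrelevance is the fraction of sample descriptions containing the token, that relevance is a crude 1.0/0.0 indicator of whether the token appears in the header, and that a segment is a run of adjacent tokens with the same label. All function names are hypothetical.

```python
from collections import Counter
from itertools import groupby


def tokenize(text):
    # Assumption: lowercase whitespace tokenization.
    return text.lower().split()


def irrelevance_values(sample_descriptions):
    # Assumption: irrelevance = share of sample descriptions containing the token.
    doc_freq = Counter()
    for desc in sample_descriptions:
        doc_freq.update(set(tokenize(desc)))
    total = len(sample_descriptions)
    return {tok: n / total for tok, n in doc_freq.items()}


def relevance_value(token, header_tokens):
    # Assumption: a crude stand-in for an "estimated probability of relevance";
    # 1.0 if the token also appears in the header, otherwise 0.0.
    return 1.0 if token in header_tokens else 0.0


def segment(header, description, sample_descriptions):
    # Label every token, then group adjacent tokens that share a label.
    irrelevance = irrelevance_values(sample_descriptions)
    header_tokens = set(tokenize(header))
    labelled = []
    for tok in tokenize(description):
        rel = relevance_value(tok, header_tokens)
        irr = irrelevance.get(tok, 0.0)
        labelled.append((tok, rel > irr))
    return [
        (is_relevant, [tok for tok, _ in group])
        for is_relevant, group in groupby(labelled, key=lambda pair: pair[1])
    ]


samples = [
    "free shipping on all orders",
    "red bike free shipping",
    "blue lamp with shade",
]
segments = segment("red bike", "red bike free shipping", samples)
```

With these made-up sample descriptions, the tokens shared with the header form one segment and the frequently repeated tokens form another. A real system would need a better-calibrated relevance estimate than the indicator used here.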