Method for identifying sub-sequences of interest in a sequence
Main Authors | , , |
---|---|
Format | Patent |
Language | English |
Published | 25.10.2011 |
Summary: | The present technique provides for the analysis of a data series to identify sequences of interest within the series. The analysis may be used to iteratively update a grammar used to analyze the data series or updated versions of the data series. Furthermore, the technique provides for the calculation of a minimum description length heuristic, such as a symbol compression ratio, for each sub-sequence of the analyzed data sequence. The technique may then compare a selected heuristic value against one or more reference conditions to determine if additional iteration is to be performed. The grammar and the data sequence may be updated between iterations to include a symbol representing a string corresponding to the selected heuristic value based upon a non-termination result of the comparison. Alternatively, the string corresponding to the selected heuristic value may be identified as a sequence of interest based upon a termination result of the comparison. |
---|
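The iteration described in the summary can be illustrated with a minimal Python sketch. It is not the patented method: the ratio below is a simplified stand-in for a symbol compression ratio (symbols after substitution, plus the cost of storing the new rule, divided by symbols before), and the reference condition (stop when no candidate brings the ratio below a threshold) is an assumption, since the summary does not state the actual conditions. The names `compression_ratio`, `best_candidate`, `substitute`, `find_sequences`, `threshold`, and `max_len` are all hypothetical. For simplicity, the sketch reports every string added to the grammar as a candidate sequence of interest.

```python
def count_non_overlapping(seq, sub):
    """Count left-to-right, non-overlapping occurrences of sub in seq."""
    k, i, count = len(sub), 0, 0
    while i + k <= len(seq):
        if tuple(seq[i:i + k]) == sub:
            count += 1
            i += k
        else:
            i += 1
    return count


def compression_ratio(seq, sub):
    """Simplified stand-in heuristic (an assumption, not the patent's formula).

    Cost after substitution = remaining symbols + one new symbol per
    occurrence + the rule itself (the sub-sequence plus one separator).
    """
    r, k = count_non_overlapping(seq, sub), len(sub)
    after = len(seq) - r * k + r + k + 1
    return after / len(seq)


def best_candidate(seq, max_len):
    """Return (ratio, sub) for the lowest-ratio sub-sequence, or None."""
    candidates = {
        tuple(seq[i:i + k])
        for k in range(2, max_len + 1)
        for i in range(len(seq) - k + 1)
    }
    best = None
    for sub in sorted(candidates):  # sorted for deterministic tie-breaking
        ratio = compression_ratio(seq, sub)
        if best is None or ratio < best[0]:
            best = (ratio, sub)
    return best


def substitute(seq, sub, symbol):
    """Replace non-overlapping occurrences of sub in seq with symbol."""
    out, i, k = [], 0, len(sub)
    while i < len(seq):
        if tuple(seq[i:i + k]) == sub:
            out.append(symbol)
            i += k
        else:
            out.append(seq[i])
            i += 1
    return out


def find_sequences(data, threshold=1.0, max_len=8):
    """Greedy loop: select the best heuristic value, compare, update, repeat."""
    seq, grammar = list(data), {}
    while True:
        best = best_candidate(seq, max_len)
        # Assumed termination condition: nothing compresses any further.
        if best is None or best[0] >= threshold:
            return seq, grammar
        _, sub = best
        symbol = f"<S{len(grammar) + 1}>"
        grammar[symbol] = sub          # update the grammar
        seq = substitute(seq, sub, symbol)  # update the data sequence


if __name__ == "__main__":
    final_seq, rules = find_sequences("abcdabcdabcdabcd")
    print(final_seq)
    print(rules)
```

Because each new symbol is written back into the sequence, later iterations can select strings that contain earlier symbols, which is how the grammar can become hierarchical. Enumerating every sub-sequence up to `max_len` is chosen for clarity and would be slow on long inputs.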