Single-pass low-storage arbitrary probabilistic location estimation for massive data sets

Bibliographic Details
Main Authors: Liechty, John C.; McDermott, James P.; Lin, Dennis K. J.
Format: Patent
Language: English
Published: 11.07.2006
Summary: The present invention includes a method and system for estimating a summary of a data set generated by an unknown distribution. The method includes selecting a subset of data points from the data set; applying a scoring rule to each data point of the subset, based on an estimated relative location and an assigned weight for that data point, to provide a score for each data point; selectively retaining data points to track based on those scores; and determining an estimate of the summary of the data set based on the retained data points.
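
The abstract does not spell out the scoring rule, so the following Python sketch is only an illustration of the general idea (score, retain, estimate in a single pass with bounded storage) and should not be read as the patented method. It assumes the summary is a quantile, which the title suggests but the abstract does not state. The function name `estimate_quantile`, the parameters `p` and `capacity`, the scoring rule (distance between the target quantile and the rank interval a tracked point covers), and the choice to fold a dropped point's weight into its neighbour are all hypothetical.

```python
import bisect
import random


def estimate_quantile(stream, p=0.5, capacity=64):
    """One pass over `stream`, tracking at most `capacity` weighted points.

    Illustrative sketch only: the scoring rule below is an assumption,
    not taken from the patent.
    """
    values, weights = [], []   # tracked points, kept sorted by value
    total = 0                  # number of data points seen so far

    for x in stream:
        # Every incoming point starts as a tracked point with weight 1.
        i = bisect.bisect_left(values, x)
        values.insert(i, x)
        weights.insert(i, 1)
        total += 1
        if len(values) <= capacity:
            continue

        # Score each tracked point by how far the rank interval it covers,
        # [seen / total, (seen + w) / total], lies from the target p.
        scores, seen = [], 0
        for w in weights:
            lo, hi = seen / total, (seen + w) / total
            scores.append(max(lo - p, p - hi, 0.0))
            seen += w

        # Stop tracking the worst-scoring point, but keep its weight by
        # folding it into the nearest remaining neighbour.
        worst = max(range(len(scores)), key=scores.__getitem__)
        values.pop(worst)
        w = weights.pop(worst)
        weights[min(worst, len(weights) - 1)] += w

    # Estimate: first tracked value whose cumulative weight reaches p * total.
    seen = 0
    for v, w in zip(values, weights):
        seen += w
        if seen >= p * total:
            return v
    raise ValueError("empty stream")


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
    print(estimate_quantile(data, p=0.5, capacity=200))
```

Storage stays at `capacity` points however long the stream is. In this sketch the result is exact while the target quantile stays inside the tracked window and only approximate once it drifts outside, for example on a sorted stream with a small `capacity`.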