GENERATING COMPUTATIONALLY-EFFICIENT REPRESENTATIONS OF LARGE DATASETS
Main Authors | , , |
---|---|
Format | Patent |
Language | English |
Published | 24.01.2019 |
Subjects | |
Summary: | Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing large datasets using a computationally-efficient representation are disclosed. A request to apply a coverage algorithm to a large input dataset is received. The large dataset includes sets of elements. A computationally-efficient representation of the large dataset is generated by generating a reduced set of elements, which contains fewer elements and is selected based on a defined probability. For each element in the reduced set, a determination is made regarding whether the element appears in more than a threshold number of sets. When the element appears in more than the threshold number of sets, the element is removed from sets until it appears in only the threshold number of sets. The coverage algorithm is then applied to the computationally-efficient representation to identify a subset of the sets. The system provides data identifying the subset of the sets in response to the received request. |
---|---|
Bibliography: | Application Number: US201816042975 |
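
The steps in the summary can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `build_sketch`, `greedy_cover`, `keep_prob`, and `max_degree` are hypothetical, the choice of which sets lose an over-represented element (here, those later in sorted name order) is an assumption, and greedy maximum coverage stands in for the unspecified "coverage algorithm".

```python
import random
from collections import defaultdict


def build_sketch(sets, keep_prob, max_degree, seed=0):
    """Build a reduced representation of `sets` (a dict: name -> set of elements).

    Assumption: each element is kept independently with probability `keep_prob`,
    and a kept element is retained in at most `max_degree` sets.
    """
    rng = random.Random(seed)
    universe = sorted({e for members in sets.values() for e in members})
    # Step 1: sample a reduced set of elements based on a defined probability.
    kept = {e for e in universe if rng.random() < keep_prob}

    # Step 2: cap the number of sets each kept element appears in.
    degree = defaultdict(int)
    sketch = {name: set() for name in sets}
    for name in sorted(sets):  # sorted for a deterministic tie-breaking order
        for e in sets[name]:
            if e in kept and degree[e] < max_degree:
                sketch[name].add(e)
                degree[e] += 1
    return sketch


def greedy_cover(sets, k):
    """Pick up to k set names greedily by marginal coverage (an assumed stand-in
    for the coverage algorithm, which the summary does not specify)."""
    covered, chosen = set(), []
    remaining = dict(sets)
    for _ in range(k):
        if not remaining:
            break
        best = max(sorted(remaining), key=lambda n: len(remaining[n] - covered))
        if not remaining[best] - covered:
            break  # no remaining set adds new elements
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen


if __name__ == "__main__":
    data = {"A": {1, 2, 3}, "B": {2, 3}, "C": {3, 4}}
    sketch = build_sketch(data, keep_prob=1.0, max_degree=2)
    print(sketch)
    print(greedy_cover(sketch, k=2))
```

The coverage step runs on the sketch rather than on the original sets, so its cost depends on the sampled, degree-capped data instead of the full input. How `keep_prob` and `max_degree` should be chosen is not stated in the summary and is left open here.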