GENERATING COMPUTATIONALLY-EFFICIENT REPRESENTATIONS OF LARGE DATASETS

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing large datasets using a computationally-efficient representation are disclosed. A request to apply a coverage algorithm to a large input dataset is received. The large dataset includes sets...

Full description

Saved in:

Bibliographic Details
Main Authors	Mirrokni Banadaki, Seyed Vahab, Bateni, MohammadHossein, Esfandiari, Hossein
Format	Patent
Language	English
Published	24.01.2019
Subjects	CALCULATING COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS COMPUTING COUNTING ELECTRIC COMMUNICATION TECHNIQUE ELECTRIC DIGITAL DATA PROCESSING ELECTRICITY PHYSICS TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHICCOMMUNICATION
Online Access	Get full text

Cover

Loading…

More Information
Summary:	Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing large datasets using a computationally-efficient representation are disclosed. A request to apply a coverage algorithm to a large input dataset is received. The large dataset includes sets of elements. A computationally-efficient representation of the large dataset is generated by generating a reduced set of elements that contains fewer elements based on a defined probability. For each element in the reduced set, a determination is made regarding whether the element appears in more than a threshold number of sets. When the element appears in more than the threshold number, the element is removed from sets until the element appears in only the threshold number. The coverage algorithm is then applied to the computationally-efficient representation to identify a subset of the sets. The system provides data identifying the subset of the sets in response to the received request.
Bibliography:	Application Number: US201816042975