USING MACHINE LEARNING TO PREDICT BIG DATA ENVIRONMENT PERFORMANCE

Bibliographic Details
Main Authors: Marimadaiah Sanjai, Dominiak Jacek, Gupta Smrati
Format: Patent
Language: English
Published: 18.05.2017
Summary: A method includes performing operations as follows on a processor: receiving a big data dataset comprising new active data; receiving a request to predict a level of performance with respect to a performance parameter of a data processing system in analyzing the new active data; selecting a machine learning algorithm from a plurality of machine learning algorithms based on the performance parameter to obtain a selected machine learning algorithm; selecting a group of historical metadata from a plurality of groups of historical metadata of datasets that have previously been analyzed using the data processing system to provide a selected group of historical metadata; applying the selected machine learning algorithm to the selected group of historical metadata to generate a model of the selected group of historical metadata; obtaining metadata of the new active data; applying the model to the metadata of the new active data to generate a prediction of the level of performance with respect to the performance parameter; and configuring the data processing system for analyzing the new active data based on the prediction.
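
The sequence of operations in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything concrete in it is an assumption made for illustration: the parameter names (`runtime_seconds`, `peak_memory_gb`), the single metadata feature `size_gb`, the `format` tag used to pick a historical group, the two toy algorithms, and the worker-scaling rule in `configure`.

```python
import math
from statistics import mean

# Assumption: which algorithm suits which performance parameter.
ALGORITHM_FOR_PARAMETER = {
    "runtime_seconds": "linear_regression",
    "peak_memory_gb": "nearest_neighbors",
}


def fit_linear(xs, ys):
    """One-feature ordinary least squares; returns a predictor function."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def fit_nearest(xs, ys, k=2):
    """k-nearest-neighbour average; returns a predictor function."""
    def predict(x):
        nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x))[:k]
        return mean(y for _, y in nearest)
    return predict


ALGORITHMS = {"linear_regression": fit_linear, "nearest_neighbors": fit_nearest}


def predict_performance(parameter, history_groups, new_metadata):
    """Select algorithm and history group, fit a model, apply it."""
    # Select the algorithm based on the requested performance parameter.
    fit = ALGORITHMS[ALGORITHM_FOR_PARAMETER[parameter]]
    # Assumption: the historical group is chosen by matching a format tag.
    group = history_groups[new_metadata["format"]]
    # Generate a model from the selected group of historical metadata.
    model = fit([row["size_gb"] for row in group],
                [row[parameter] for row in group])
    # Apply the model to the metadata of the new active data.
    return model(new_metadata["size_gb"])


def configure(predicted_runtime, budget_seconds=600, base_workers=4):
    """Hypothetical rule: scale workers by how far the prediction exceeds the budget."""
    return base_workers * max(1, math.ceil(predicted_runtime / budget_seconds))


# Hypothetical metadata of datasets previously analyzed by the system.
HISTORY = {
    "csv": [
        {"size_gb": 10, "runtime_seconds": 100, "peak_memory_gb": 1.0},
        {"size_gb": 20, "runtime_seconds": 200, "peak_memory_gb": 2.0},
        {"size_gb": 30, "runtime_seconds": 300, "peak_memory_gb": 3.0},
    ],
}

new_data = {"format": "csv", "size_gb": 40}
runtime = predict_performance("runtime_seconds", HISTORY, new_data)
workers = configure(runtime, budget_seconds=100)
```

Only metadata about the new dataset is passed to the model here, never the dataset itself, which mirrors the summary's distinction between the new active data and its metadata.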
Bibliography: Application Number: US201514944969