Distributed Networked Real-time Learning


Bibliographic Details
Published in: arXiv.org
Main Authors: Garcia, Alfredo; Wang, Luochao; Huang, Jeff; Hong, Lingzhou
Format: Paper
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 09.09.2020

More Information
Summary: Many machine learning algorithms have been developed under the assumption that data sets are already available in batch form. Yet in many application domains data is only available sequentially over time via compute nodes in different geographic locations. In this paper, we consider the problem of learning a model when streaming data cannot be transferred to a single location in a timely fashion. In such cases, a distributed architecture for learning relying on a network of interconnected "local" nodes is required. We propose a distributed scheme in which every local node implements stochastic gradient updates based upon a local data stream. To ensure robust estimation, a network regularization penalty is used to maintain a measure of cohesion in the ensemble of models. We show that the ensemble average approximates a stationary point and characterize the degree to which individual models differ from the ensemble average. We compare the results with federated learning and conclude that the proposed approach is more robust to heterogeneity in data streams (in data rates and estimation quality). We illustrate the results with an application to image classification with a deep learning model based upon convolutional neural networks.
ISSN: 2331-8422
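
The summary describes each local node running stochastic gradient updates on its own data stream, with a network regularization penalty pulling neighboring models together. The following is a minimal sketch of that idea, not the authors' code: the linear-regression task, ring topology, step size, and penalty weight are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n_nodes, dim, n_steps = 5, 3, 2000
w_true = rng.normal(size=dim)                  # common ground-truth model (assumed task)

# Ring adjacency matrix over the local nodes (assumed network topology).
A = np.zeros((n_nodes, n_nodes))
for k in range(n_nodes):
    A[k, (k - 1) % n_nodes] = A[k, (k + 1) % n_nodes] = 1.0

W = rng.normal(size=(n_nodes, dim))            # one local model per node
step_size, penalty = 0.05, 0.5                 # illustrative hyperparameters

for t in range(n_steps):
    for k in range(n_nodes):
        # One sample from node k's local stream; noise level differs per node
        # to mimic heterogeneity in estimation quality.
        x = rng.normal(size=dim)
        y = x @ w_true + (0.1 + 0.2 * k) * rng.normal()
        grad = (W[k] @ x - y) * x              # stochastic gradient of the squared loss
        # Network regularization: pull W[k] toward its neighbors' models.
        consensus = sum(A[k, j] * (W[k] - W[j]) for j in range(n_nodes))
        W[k] = W[k] - step_size * (grad + penalty * consensus)

ensemble_avg = W.mean(axis=0)
print("ensemble average error:", np.linalg.norm(ensemble_avg - w_true))
print("max node deviation from average:",
      max(np.linalg.norm(W[k] - ensemble_avg) for k in range(n_nodes)))
```

The two printed quantities mirror the paper's stated results at a toy scale: how well the ensemble average approximates a good model, and how far individual node models drift from that average under the cohesion penalty.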