Towards Accelerating Generic Machine Learning Prediction Pipelines

Machine Learning models are often composed of sequences of transformations. While this design makes it easy to decompose and accelerate single model components at training time, predictions require low latency and high performance predictability, whereby end-to-end runtime optimizations and acceleratio...

Bibliographic Details
Published in	2017 IEEE International Conference on Computer Design (ICCD), pp. 431-434
Main Authors	Scolari, Alberto; Lee, Yunseong; Weimer, Markus; Interlandi, Matteo
Format	Conference Proceeding
Language	English
Published	IEEE 01.11.2017
Subjects	Acceleration; Computational modeling; Data models; Field programmable gate arrays; FPGA; Machine Learning; Model Scoring; Pipelines; Prediction Pipelines; Predictive models; Training

More Information
Summary:	Machine Learning models are often composed of sequences of transformations. While this design makes it easy to decompose and accelerate single model components at training time, predictions require low latency and high performance predictability, so end-to-end runtime optimizations and acceleration are needed to meet these goals. This paper sheds some light on the problem by using a production-like model and showing how, by redesigning model pipelines for efficient execution over CPUs and FPGAs, performance improvements of several fold can be achieved.
ISSN:	1063-6404 2576-6996
DOI:	10.1109/ICCD.2017.76