GrayWulf: Scalable Software Architecture for Data Intensive Computing

Bibliographic Details
Published in	2009 42nd Hawaii International Conference on System Sciences, pp. 1-10
Main Authors	Simmhan, Y., Barga, R., van Ingen, C., Nieto-Santisteban, M., Dobos, L., Li, N., Shipway, M., Szalay, A.S., Werner, S., Heasley, J.
Format	Conference Proceeding
Language	English
Published	IEEE 01.01.2009
Subjects	Software architecture

More Information
Summary:	Big data presents new challenges to both cluster infrastructure software and parallel application design. We present a set of software services and design principles for data intensive computing with petabyte data sets, named GrayWulf. These services are intended for deployment on a cluster of commodity servers similar to the well-known Beowulf clusters. We use the Pan-STARRS system currently under development as an example of the architecture and principles in action.
ISBN:	9780769534503, 0769534503
ISSN:	1530-1605, 2572-6862
DOI:	10.1109/HICSS.2009.235