MaDaTS: Managing Data on Tiered Storage for Scientific Workflows

Bibliographic Details
Published in Journal of Open Source Software Vol. 3; no. 30; p. 830
Main Authors Ghoshal, Devarshi; Ramakrishnan, Lavanya
Format Journal Article
Language English
Published United States Open Source Initiative - NumFOCUS; Copyright - Open Journals 01.10.2018
Summary: Scientific workflows process large amounts of data through complex simulation and analysis tasks. Meanwhile, the need to minimize I/O costs on next-generation systems and the emergence of new storage technologies (NVRAM, SSDs, etc.) are resulting in deeper storage hierarchies on High Performance Computing (HPC) systems. A multi-tiered storage hierarchy complicates workflow and data management. There is a need for simple and flexible data abstractions that allow users to seamlessly manage workflow data and tasks on HPC systems with multiple storage tiers. MaDaTS (Managing Data on Tiered Storage for Scientific Workflows) provides an API and a command-line tool that allow users to manage their workflows and data on tiered storage (Ghoshal & Ramakrishnan, 2017).
Bibliography: AC02-05CH11231
USDOE Office of Science (SC)
ISSN: 2475-9066
DOI: 10.21105/joss.00830