SciSpace: A scientific collaboration workspace for geo-distributed HPC data centers
Published in | Future generation computer systems Vol. 101; pp. 398 - 409 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | Elsevier B.V. 01.12.2019 |
Summary: | Future terabit networks promise to dramatically improve big data movement between geographically dispersed HPC data centers. The scientific community is taking advantage of terabit networks such as DOE’s ESnet, accelerating the trend toward a small world of collaboration between geo-distributed HPC data centers. Such networks improve information and resource sharing for joint simulation and analysis between the HPC data centers. However, several challenges stand in the way of effective collaboration: providing a collective view of multi-site shared data, minimizing the performance degradation of scientific applications running in such collaborative environments and, most critical of all, defining data sharing policies for such collaborations. In this paper, we propose SciSpace, a Scientific Collaboration Workspace for collaborative data centers. It provides a global view of information shared from multiple geo-distributed HPC data centers under a single workspace. SciSpace supports native data access to achieve high performance when data must be read or written in the native data center namespace. This is accomplished by integrating an on-demand metadata export protocol. To optimize scientific collaboration across HPC data centers, SciSpace implements a search and discovery service. For the evaluation, we configured two small-scale geo-distributed HPC data centers equipped with LustreFS and connected via a high-speed InfiniBand network such as the terabit network of DOE’s ESnet. We show the feasibility of SciSpace using real scientific datasets and applications. The evaluation results show an average 36% performance boost when the proposed native data access is employed in collaborations. We also emulate a real climate science collaboration to validate the usefulness of SciSpace. |
---|---|
ISSN: | 0167-739X 1872-7115 |
DOI: | 10.1016/j.future.2019.06.006 |