Distributed Management of Massive Data: An Efficient Fine-Grain Data Access Scheme
Published in | High Performance Computing for Computational Science - VECPAR 2008 pp. 532 - 543 |
---|---|
Main Authors | , , |
Format | Book Chapter |
Language | English |
Published | Berlin, Heidelberg: Springer Berlin Heidelberg, 2008 |
Series | Lecture Notes in Computer Science |
ISBN | 3540928588 9783540928584 |
ISSN | 0302-9743 1611-3349 |
DOI | 10.1007/978-3-540-92859-1_47 |
Summary | This paper addresses the problem of efficiently storing and accessing massive data blocks in a large-scale distributed environment, while providing efficient fine-grain access to data subsets. This issue is crucial in the context of applications in the field of databases, data mining and multimedia. We propose a data sharing service based on distributed, RAM-based storage of data, while leveraging a DHT-based, natively parallel metadata management scheme. As opposed to the most commonly used grid storage infrastructures that provide mechanisms for explicit data localization and transfer, we provide a transparent access model, where data are accessed through global identifiers. Our proposal has been validated through a prototype implementation whose preliminary evaluation on the Grid’5000 testbed provides promising results. |