Integrating memory-mapping and N-dimensional hash function for fast and efficient grid-based climate data query

Bibliographic Details
Published in Annals of GIS Vol. 27; no. 1; pp. 57-69
Main Authors Xu, Mengchao, Zhao, Liang, Yang, Ruixin, Yang, Jingchao, Sha, Dexuan, Yang, Chaowei
Format Journal Article
Language English
Published Taylor & Francis 02.01.2021
Summary: Database systems are pervasive components in the current big data era. However, efficiently managing and querying grid-based or array-based multidimensional climate data is still beyond the capabilities of most databases. The mismatch between the array data model and the relational data model limits the performance of multidimensional data queries in a traditional database once the data volume reaches a certain threshold. Even a trivial data retrieval on large multidimensional datasets in a relational database is time-consuming and requires enormous storage space. Given the scientific interest in and application demand for time-sensitive spatiotemporal data query and analysis, there is an urgent need for efficient data storage and fast data retrieval solutions for large multidimensional datasets. In this paper, we introduce a method for storing and accessing multidimensional data, which includes a new hash function algorithm that works on a unified data storage structure and is coupled with memory-mapping technology. A prototype database library, LotDB, developed as an implementation of this method, is described; it shows promising data query performance compared with SciDB, MongoDB, and PostgreSQL.
ISSN: 1947-5683, 1947-5691
DOI: 10.1080/19475683.2020.1743354