Comprehensive Analytics of Large Data Query Processing on Relational Database with SSDs

Bibliographic Details
Published in Databases Theory and Applications Vol. 8506; pp. 135-146
Main Authors Suzuki, Keisuke; Hayamizu, Yuto; Yokoyama, Daisaku; Nakano, Miyuki; Kitsuregawa, Masaru
Format Book Chapter
Language English
Published Switzerland: Springer International Publishing AG, 2014
Series Lecture Notes in Computer Science
Summary: Solid-state drives (SSDs) are widely used in large data processing applications because they offer higher random access throughput than HDDs and can process I/O requests in parallel. These advantages can resolve the I/O bottlenecks that database systems face on HDDs. However, access latency in the cache hierarchy may become a new bottleneck in SSD-based databases. In this study, we quantitatively analyzed the behavior of SSD-based databases, using the hash join operation as a case study. We found that cache misses in SSD-based databases can be decreased by reducing the hash table size so that it fits into the cache. This works because, thanks to the high throughput of SSDs, the I/O cost does not increase even though the hash join partition files become fragmented. We also observed that cache misses do not increase when running a multi-hash-join query. This is because the multiple hash tables together can fit into the cache in SSD-based databases, in contrast to HDD-based databases, where hash tables require almost all of the available memory. Overall, our analysis shows that the performance of multiple queries in SSD-based databases can be improved by considering the data access locality of the hash join operation and choosing a hash table size that fits into the cache.
ISBN: 3319086073, 9783319086071
ISSN: 0302-9743, 1611-3349
DOI: 10.1007/978-3-319-08608-8_12
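
The summary's central idea, sizing each hash table to fit the cache by splitting the join into more partitions, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the `bytes_per_entry` estimate, and the `cache_bytes` budget are all hypothetical, and in-memory lists stand in for the partition files a database would write to the SSD.

```python
import math
from collections import defaultdict


def choose_partition_count(build_rows, bytes_per_entry, cache_bytes):
    """Smallest partition count whose average per-partition table fits the budget.

    bytes_per_entry and cache_bytes are illustrative assumptions, not measured values.
    """
    total_bytes = build_rows * bytes_per_entry
    return max(1, math.ceil(total_bytes / cache_bytes))


def partition(rows, key_index, n_parts):
    """Split rows into n_parts buckets by hashing the join key."""
    parts = [[] for _ in range(n_parts)]
    for row in rows:
        parts[hash(row[key_index]) % n_parts].append(row)
    return parts


def partitioned_hash_join(build, probe, build_key, probe_key,
                          bytes_per_entry, cache_bytes):
    """Join two lists of tuples, building one small hash table per partition."""
    n_parts = choose_partition_count(len(build), bytes_per_entry, cache_bytes)
    build_parts = partition(build, build_key, n_parts)
    probe_parts = partition(probe, probe_key, n_parts)

    result = []
    for b_part, p_part in zip(build_parts, probe_parts):
        # Build phase: the table only holds this partition's rows.
        table = defaultdict(list)
        for row in b_part:
            table[row[build_key]].append(row)
        # Probe phase: matching keys always hash to the same partition.
        for row in p_part:
            for match in table.get(row[probe_key], ()):
                result.append(match + row)
    return result


if __name__ == "__main__":
    build = [(1, "a"), (2, "b"), (3, "c")]
    probe = [(2, "x"), (3, "y"), (3, "z"), (4, "w")]
    for joined in sorted(partitioned_hash_join(build, probe, 0, 0, 32, 64)):
        print(joined)
```

A smaller `cache_bytes` budget yields more, smaller partitions. On an HDD the resulting fragmented partition files would add seek cost, while the summary reports that on SSDs the I/O cost does not increase, which is why the smaller tables pay off there.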