In-situ visual exploration over big raw data

Bibliographic Details
Published in Information systems (Oxford) Vol. 95; p. 101616
Main Authors Bikakis, Nikos, Maroulis, Stavros, Papastefanatos, George, Vassiliadis, Panos
Format Journal Article
Language English
Published Oxford Elsevier Ltd 01.01.2021
Elsevier Science Ltd

Summary: Data exploration and visual analytics systems are of great importance in Open Science scenarios, where less tech-savvy researchers wish to access and visually explore big raw data files (e.g., json, csv) generated by scientific experiments, using commodity hardware and without being overwhelmed by the tedious processes of data loading, indexing and query optimization. In this paper, we present our work on enabling efficient query processing over large raw data files for interactive visual exploration and analytics scenarios. We introduce a framework, named RawVis, built on top of a lightweight in-memory tile-based index, VALINOR, that is constructed on-the-fly given the first user query over a raw file and progressively adapted based on user interaction. We evaluate the performance of a prototype implementation against three alternatives and show that our method outperforms them in terms of response time, disk accesses and memory consumption. In particular, during an exploration scenario, the proposed method is in most cases about 5-10× faster than existing solutions and requires significantly fewer memory resources.
• Progressive and adaptive processing for in-situ visualization and analytics.
• Visual user interactions over raw data as data-access operations.
• A main-memory index, constructed on-the-fly based on the first user interaction.
• User-driven techniques that progressively adapt the index structure during exploration.
• Improvement in terms of execution time, I/O operations, and memory consumption.
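The idea of a tile-based index built on-the-fly over a raw file can be illustrated with a minimal Python sketch. This is not the RawVis or VALINOR implementation: the class name `TileIndex`, the fixed uniform grid, the column names and the sample data are all assumptions made for illustration, and the progressive, user-driven adaptation of tiles described in the summary is left out. The sketch scans a CSV once, assigns each record to a grid tile by two numeric columns, keeps only coordinates and byte offsets in memory, and answers a window query by reading the remaining attribute from the raw file.

```python
import io
from collections import defaultdict


class TileIndex:
    """Toy uniform-grid index over two numeric columns of a raw CSV (hypothetical)."""

    def __init__(self, raw, x_col, y_col, bounds, n=4):
        self.raw = raw          # seekable binary file object holding the CSV
        self.bounds = bounds    # assumed known: (x_min, y_min, x_max, y_max)
        self.n = n              # tiles per axis
        self.tiles = defaultdict(list)
        raw.seek(0)
        self.header = raw.readline().decode().rstrip("\r\n").split(",")
        xi, yi = self.header.index(x_col), self.header.index(y_col)
        while True:
            offset = raw.tell()  # byte position of the record in the raw file
            line = raw.readline()
            if not line:
                break
            if not line.strip():
                continue         # skip blank lines
            fields = line.decode().rstrip("\r\n").split(",")
            x, y = float(fields[xi]), float(fields[yi])
            # Keep only the coordinates and the file offset in memory.
            self.tiles[self._tile(x, y)].append((x, y, offset))

    def _tile(self, x, y):
        """Map a point to its (column, row) tile, clamped to the grid."""
        x0, y0, x1, y1 = self.bounds
        tx = int((x - x0) / (x1 - x0) * self.n)
        ty = int((y - y0) / (y1 - y0) * self.n)
        return (max(0, min(tx, self.n - 1)), max(0, min(ty, self.n - 1)))

    def query(self, window, attr):
        """Return `attr` values of records inside `window`, read by offset."""
        wx0, wy0, wx1, wy1 = window
        tx0, ty0 = self._tile(wx0, wy0)
        tx1, ty1 = self._tile(wx1, wy1)
        ai = self.header.index(attr)
        values = []
        # Visit only the tiles that overlap the query window.
        for tx in range(tx0, tx1 + 1):
            for ty in range(ty0, ty1 + 1):
                for x, y, offset in self.tiles.get((tx, ty), ()):
                    if wx0 <= x <= wx1 and wy0 <= y <= wy1:
                        self.raw.seek(offset)
                        fields = self.raw.readline().decode().rstrip("\r\n").split(",")
                        values.append(fields[ai])
        return values


# Made-up sample data standing in for a raw file on disk.
sample = b"x,y,temp\n1,1,10\n9,9,20\n2,3,30\n8,2,40\n"
index = TileIndex(io.BytesIO(sample), "x", "y", bounds=(0, 0, 10, 10), n=2)
temps = index.query((0, 0, 5, 5), "temp")
```

Storing byte offsets rather than full records keeps the index small, at the cost of one file seek per matching record; in a real system that trade-off would depend on the workload.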
ISSN: 0306-4379
1873-6076
DOI: 10.1016/j.is.2020.101616