Chat2Query: A Zero-Shot Automatic Exploratory Data Analysis System with Large Language Models

Bibliographic Details
Published in: 2024 IEEE 40th International Conference on Data Engineering (ICDE), pp. 5429-5432
Main Authors: Zhu, Jun-Peng; Cai, Peng; Niu, Boyan; Ni, Zheming; Xu, Kai; Huang, Jiajun; Wan, Jianwei; Ma, Shengbo; Wang, Bing; Zhang, Donghui; Tang, Liu; Liu, Qi
Format: Conference Proceeding
Language: English
Published: IEEE, 13.05.2024
Summary: Data analysts often encounter two primary challenges when conducting exploratory data analysis with SQL: (1) the need to skillfully craft SQL queries, and (2) the need to generate suitable visualizations that aid the interpretation of query results. The emergence of large language models (LLMs) has brought a paradigm shift in text-to-SQL and data-to-chart. This paper presents Chat2Query, an LLM-empowered zero-shot automatic exploratory data analysis system. First, Chat2Query provides a user-friendly interface that lets users interact with the database directly in natural language. Second, Chat2Query offers an LLM-empowered text-to-SQL generator, SQL rewriter, SQL formatter, and data-to-chart generator. Third, Chat2Query is built on TiDB Serverless, which provides elasticity and scalability. This integration allows Chat2Query to adapt seamlessly to changing workloads in line with the evolving demands of the user. We have implemented and deployed Chat2Query in a production environment, and we demonstrate its usability and efficiency in three representative real-world scenarios.
ISSN: 2375-026X
DOI: 10.1109/ICDE60146.2024.00420