Can GPT embeddings enhance visual exploration of literature datasets? A case study on isostatic pressing research

Bibliographic Details
Published in Journal of visualization Vol. 27; no. 6; pp. 1213 - 1226
Main Authors Lv, Hongjiang; Niu, Zhibin; Han, Wei; Li, Xiang
Format Journal Article
Language English
Published Berlin/Heidelberg Springer Berlin Heidelberg 01.12.2024
Summary: Visual exploration of literature datasets, especially in specialized domains like isostatic pressing in materials research, aids scientific understanding and discovery but demands robust natural language processing techniques for semantic representation. Existing methods often rely on complex and time-consuming processes to obtain text embeddings, which are numerical representations of text that capture its semantic information and similarity. The quality of text embeddings is crucial for enabling visual exploration of literature datasets. Our research question is whether visual exploration of literature datasets can benefit from GPT (generative pre-trained transformer) text embeddings. We seek to answer this question by performing case studies and expert interviews. To do this, we curated a unique literature dataset about isostatic pressing, sourced from diverse periods and genres. Utilizing a GPT embedding model, we generated embeddings for textual analysis, visualizing and examining their semantic interrelations. Expert reviews were undertaken to evaluate the utility of these techniques. Our findings show that GPT text embeddings offer significant improvements in visually exploring literature datasets, revealing deep semantic similarities and diversities. We also discuss the implications and limitations of our study and propose directions for future research.
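
The summary describes text embeddings as numerical vectors whose closeness reflects semantic similarity. As a rough illustration of that idea only, the Python sketch below ranks documents by cosine similarity. The three-dimensional vectors, the document ids, and the helper names `cosine_similarity` and `nearest_neighbors` are hypothetical; the record does not specify the embedding model, its dimensionality, or the similarity measure the authors used.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def nearest_neighbors(embeddings, query_id, k=2):
    """Return the k documents most similar to query_id, best first."""
    query = embeddings[query_id]
    scored = [
        (doc_id, cosine_similarity(query, vec))
        for doc_id, vec in embeddings.items()
        if doc_id != query_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy vectors standing in for real model output (assumption:
# real embeddings would have hundreds or thousands of dimensions).
toy_embeddings = {
    "hip_titanium": [0.9, 0.1, 0.0],
    "cip_ceramics": [0.7, 0.6, 0.1],
    "graph_layout": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    for doc_id, score in nearest_neighbors(toy_embeddings, "hip_titanium"):
        print(f"{doc_id}: {score:.3f}")
```

In this toy data the two pressing-related vectors score as far more similar to each other than either does to the unrelated one, which is the kind of structure a visual exploration tool would then lay out spatially.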
ISSN: 1343-8875, 1875-8975
DOI: 10.1007/s12650-024-01010-z