Do LLMs Dream of Ontologies?

Bibliographic Details
Published in: arXiv.org
Main Authors: Bombieri, Marco; Fiorini, Paolo; Ponzetto, Simone Paolo; Rospocher, Marco
Format: Paper
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 26.01.2024

More Information
Summary: Large language models (LLMs) have recently revolutionized automated text understanding and generation. The performance of these models relies on the large number of parameters of the underlying neural architectures, which allows LLMs to memorize part of the vast quantity of data seen during training. This paper investigates whether, and to what extent, general-purpose pre-trained LLMs have memorized information from known ontologies. Our results show that LLMs partially know ontologies: they can, and indeed do, memorize concepts from ontologies mentioned in the text, but the level of memorization of their concepts seems to vary in proportion to their popularity on the Web, the primary source of their training material. We additionally propose new metrics to estimate the degree of memorization of ontological information in LLMs by measuring the consistency of the output produced across different prompt repetitions, query languages, and degrees of determinism.
ISSN: 2331-8422
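
The summary describes metrics that estimate memorization by checking how consistently a model answers the same ontological query across repetitions. The Python sketch below is a rough, hypothetical illustration of that idea only, not the paper's actual metrics: the `query_fn` callable, the toy model, and the concept identifiers are invented placeholders, and agreement is measured simply as the frequency of the most common answer.

```python
from collections import Counter
from typing import Callable, List


def consistency_score(query_fn: Callable[[str], str], prompt: str, n_repeats: int = 10) -> float:
    """Fraction of repeated queries that return the most frequent answer.

    A score of 1.0 means the model answered identically every time, which a
    consistency-based memorization metric would read as stable (likely
    memorized) knowledge; lower scores indicate unstable answers.
    """
    # Normalize answers lightly so trivial formatting differences do not count as disagreement.
    answers: List[str] = [query_fn(prompt).strip().lower() for _ in range(n_repeats)]
    most_common_count = Counter(answers).most_common(1)[0][1]
    return most_common_count / n_repeats


if __name__ == "__main__":
    import random

    # Toy stand-in for an LLM call: answers "correctly" 9 times out of 10 for a
    # partially memorized concept. All identifiers here are hypothetical.
    def toy_query(prompt: str) -> str:
        return random.choice(["EX:0001"] * 9 + ["EX:9999"])

    prompt = "What is the identifier of concept X in the Example Ontology?"
    print(f"consistency = {consistency_score(toy_query, prompt):.2f}")
```

The same scoring function could, in principle, be applied across different query languages or sampling temperatures by swapping in a different `query_fn`, which is how the summary frames the broader family of consistency measurements.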