Do large language models resemble humans in language use?
Main Authors | , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 10.03.2023 |
Subjects | |
Summary: | Large language models (LLMs) such as ChatGPT and Vicuna have shown remarkable capacities in comprehending and producing language. However, their internal workings remain a black box, and it is unclear whether LLMs and chatbots can develop humanlike characteristics in language use. Cognitive scientists have devised many experiments that probe, and have made great progress in explaining, how people comprehend and produce language. We subjected ChatGPT and Vicuna to 12 of these experiments ranging from sounds to dialogue, preregistered and with 1000 runs (i.e., iterations) per experiment. ChatGPT and Vicuna replicated the human pattern of language use in 10 and 7 out of the 12 experiments, respectively. The models associated unfamiliar words with different meanings depending on their forms, continued to access recently encountered meanings of ambiguous words, reused recent sentence structures, attributed causality as a function of verb semantics, and accessed different meanings and retrieved different words depending on an interlocutor's identity. In addition, ChatGPT, but not Vicuna, nonliterally interpreted implausible sentences that were likely to have been corrupted by noise, drew reasonable inferences, and overlooked semantic fallacies in a sentence. Finally, unlike humans, neither model preferred using shorter words to convey less informative content, nor did they use context to resolve syntactic ambiguities. We discuss how these convergences and divergences may result from the transformer architecture. Overall, these experiments demonstrate that LLMs such as ChatGPT (and Vicuna to a lesser extent) are humanlike in many aspects of human language processing. |
---|---|
DOI: | 10.48550/arxiv.2303.08014 |