Explaining human performance in psycholinguistic tasks with models of semantic similarity based on prediction and counting: A review and empirical validation

Bibliographic Details
Published in	Journal of Memory and Language, Vol. 92, pp. 57-78
Main Authors	Mandera, Paweł; Keuleers, Emmanuel; Brysbaert, Marc
Format	Journal Article
Language	English
Published	New York: Elsevier Inc, 01.02.2017; Elsevier BV
Subjects	Distributional semantics; Dutch language; English language; Human performance; Learning; Linguistic Theory; Priming; Psycholinguistic resource; Psycholinguistics; Semantic model; Semantic priming; Semantic processing; Semantics; Semiotics; Sociolinguistics; Word meaning
Summary:
• We compare classical distributional semantics models to a new, prediction-based class of models.
• In evaluation we use psycholinguistically relevant tasks, including a semantic priming megastudy.
• We find that the new class of model generally provides a better or comparable fit to behavioral data.
• We release pre-trained semantic spaces for Dutch and English and an open-source interface.

Recent developments in distributional semantics (Mikolov, Chen, Corrado, & Dean, 2013; Mikolov, Sutskever, Chen, Corrado, & Dean, 2013) include a new class of prediction-based models that are trained on a text corpus and that measure semantic similarity between words. We discuss the relevance of these models for psycholinguistic theories and compare them to more traditional distributional semantic models. We compare the models’ performances on a large dataset of semantic priming (Hutchison et al., 2013) and on a number of other tasks involving semantic processing and conclude that the prediction-based models usually offer a better fit to behavioral data. Theoretically, we argue that these models bridge the gap between traditional approaches to distributional semantics and psychologically plausible learning principles. As an aid to researchers, we release semantic vectors for English and Dutch for a range of models together with a convenient interface that can be used to extract a great number of semantic similarity measures.
ISSN:	0749-596X, 1096-0821
DOI:	10.1016/j.jml.2016.04.001