MLaKE: Multilingual Knowledge Editing Benchmark for Large Language Models
Format: Journal Article
Language: English
Published: 07.04.2024
Summary: The widespread use of large language models (LLMs) makes it essential that the knowledge embedded in their parameters be accurate and up to date. Existing research on knowledge editing concentrates primarily on monolingual settings, neglecting the complexities introduced by multilingual contexts and multi-hop reasoning. To address these challenges, our study introduces MLaKE (Multilingual Language Knowledge Editing), a novel benchmark comprising 4072 multi-hop and 5360 single-hop questions designed to evaluate the adaptability of knowledge editing methods across five languages: English, Chinese, Japanese, French, and German. MLaKE aggregates fact chains from Wikipedia across languages and uses LLMs to generate questions in both free-form and multiple-choice formats. We evaluate the multilingual generalization of existing knowledge editing methods on MLaKE. These methods achieve higher success rates on English samples than on samples in other languages, and their generalization is limited in multilingual experiments. Notably, they generalize better to languages within the same language family than to languages from different families. These results underscore the need for advances in multilingual knowledge editing, and we hope MLaKE can serve as a valuable resource for benchmarking and solution development.
DOI: 10.48550/arxiv.2404.04990
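To make the abstract's distinction between single-hop and multi-hop questions concrete, the sketch below shows what one multilingual editing case could look like. It is purely illustrative: the field names, the two languages shown, and the counterfactual fact are assumptions for exposition, not the actual MLaKE schema or data.

```python
# Purely illustrative example of a multilingual knowledge-editing case;
# the structure below is an assumption, NOT the actual MLaKE data schema.
example = {
    # Counterfactual edit applied to the model's parametric knowledge.
    "edit": {
        "subject": "United Kingdom",
        "relation": "head of government",
        "new_object": "Angela Merkel",
    },
    # Single-hop: directly probes the edited fact, here in two languages.
    "single_hop": {
        "en": {"question": "Who is the head of government of the United Kingdom?",
               "answer": "Angela Merkel"},
        "de": {"question": "Wer ist der Regierungschef des Vereinigten Königreichs?",
               "answer": "Angela Merkel"},
    },
    # Multi-hop: chains the edited fact with an unedited fact
    # (the new head of government's nationality), in multiple-choice form.
    "multi_hop": {
        "en": {"question": ("What is the nationality of the head of "
                            "government of the United Kingdom?"),
               "choices": ["German", "British", "French", "Japanese"],
               "answer": "German"},
    },
}

# A post-edit evaluation would pose each question to the edited model in every
# language and count an answer as correct only if it reflects the new fact.
for hop_type in ("single_hop", "multi_hop"):
    for lang, qa in example[hop_type].items():
        print(f"[{hop_type}/{lang}] {qa['question']} -> {qa['answer']}")
```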