INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge

Bibliographic Details
Published in arXiv.org
Main Authors Romanou, Angelika, Foroutan, Negar, Sotnikova, Anna, Chen, Zeming, Nelaturu, Sree Harsha, Singh, Shivalika, Maheshwary, Rishabh, Altomare, Micol, Haggag, Mohamed A, Snegha, A, Amayuelas, Alfonso, Azril Hafizi Amirudin, Aryabumi, Viraat, Boiko, Danylo, Chang, Michael, Chim, Jenny, Cohen, Gal, Dalmia, Aditya Kumar, Diress, Abraham, Duwal, Sharad, Dzenhaliou, Daniil, Daniel Fernando Erazo Florez, Farestam, Fabian, Imperial, Joseph Marvin, Shayekh Bin Islam, Isotalo, Perttu, Jabbarishiviari, Maral, Karlsson, Börje F, Khalilov, Eldar, Klamm, Christopher, Koto, Fajri, Krzemiński, Dominik, de Melo, Gabriel Adriano, Montariol, Syrielle, Yiyang Nan, Niklaus, Joel, Novikova, Jekaterina, Johan Samir Obando Ceron, Debjit, Paul, Ploeger, Esther, Purbey, Jebish, Rajwal, Swati, Selvan Sunitha Ravi, Rydell, Sara, Roshan Santhosh, Sharma, Drishti, Skenduli, Marjana Prifti, Arshia Soltani Moakhar, Bardia Soltani Moakhar, Tamir, Ran, Tarun, Ayush Kumar, Wasi, Azmine Toushik, Weerasinghe, Thenuka Ovin, Yilmaz, Serhan, Zhang, Mike, Schlag, Imanol, Fadaee, Marzieh, Hooker, Sara, Bosselut, Antoine
Format Paper
Language English
Published Ithaca Cornell University Library, arXiv.org 29.11.2024
Abstract The performance differential of large language models (LLMs) between languages hinders their effective deployment in many regions, inhibiting the potential economic and societal value of generative AI tools in many communities. However, the development of functional LLMs in many languages (i.e., multilingual LLMs) is bottlenecked by the lack of high-quality evaluation resources in languages other than English. Moreover, current practices in multilingual benchmark construction often translate English resources, ignoring the regional and cultural knowledge of the environments in which multilingual systems would be used. In this work, we construct an evaluation suite of 197,243 QA pairs from local exam sources to measure the capabilities of multilingual LLMs in a variety of regional contexts. Our novel resource, INCLUDE, is a comprehensive knowledge- and reasoning-centric benchmark across 44 written languages that evaluates multilingual LLMs for performance in the actual language environments where they would be deployed.
Copyright 2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
EISSN 2331-8422
Genre Working Paper/Pre-Print
Open Access Yes
Peer Reviewed No
Subjects Benchmarks; English language; Generative artificial intelligence; Large language models; Non-English languages; Performance evaluation; Quality assessment; Regional development
URI https://www.proquest.com/docview/3134992021