A Survey of Current Datasets for Code-Switching Research
Published in | 2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS), pp. 136-141 |
---|---|
Main Authors | Jose, Navya; Chakravarthi, Bharathi Raja; Suryawanshi, Shardul; Sherly, Elizabeth; McCrae, John P. |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.03.2020 |
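Subjects | code switching; dataset; Measurement; Natural language processing; Social network services; Switches; Tagging; Task analysis; Vocabulary |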
Abstract | Code-switching is a prevalent phenomenon in multilingual communities and in social media interaction. In the past ten years, we have witnessed an explosion of code-switched data on social media that brings together languages, ranging from low-resourced to high-resourced, in the same text, sometimes written in a non-native script. This increases the demand for processing code-switched data to assist users in various natural language processing tasks such as part-of-speech tagging, named entity recognition, sentiment analysis, conversational systems, and machine translation. The available corpora for code-switching research have played a major role in advancing this area of research. In this paper, we propose a set of quality metrics to evaluate these datasets and categorize them accordingly. |
---|---|
Author | Suryawanshi, Shardul; McCrae, John P.; Sherly, Elizabeth; Chakravarthi, Bharathi Raja; Jose, Navya |
Author_xml | 1. Jose, Navya (Indian Institute of Information Technology and Management-Kerala, Machine Intelligence, Trivandrum, India); 2. Chakravarthi, Bharathi Raja (Data Science Institute, National University of Ireland, Galway, Ireland); 3. Suryawanshi, Shardul (Data Science Institute, National University of Ireland, Galway, Ireland); 4. Sherly, Elizabeth (Indian Institute of Information Technology and Management-Kerala, Machine Intelligence, Trivandrum, India); 5. McCrae, John P. (Data Science Institute, National University of Ireland, Galway, Ireland) |
ContentType | Conference Proceeding |
DBID | 6IE 6IL CBEJK RIE RIL |
DOI | 10.1109/ICACCS48705.2020.9074205 |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Electronic Library Online; IEEE Proceedings Order Plans (POP All) 1998-Present |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: RIE name: IEL url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
EISBN | 172815197X 9781728151977 |
EISSN | 2575-7288 |
EndPage | 141 |
ExternalDocumentID | 9074205 |
Genre | orig-research |
GroupedDBID | 6IE 6IF 6IL 6IN AAJGR ABLEC ADZIZ ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK OCL RIE RIL |
IEDL.DBID | RIE |
ISBN | 1728151961 9781728151960 |
IngestDate | Wed Jun 26 19:26:59 EDT 2024 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
OpenAccessLink | https://aran.library.nuigalway.ie/bitstream/10379/16090/5/A_Survey_of_Current_Datasets_for_Code_Switching_research.pdf |
PageCount | 6 |
ParticipantIDs | ieee_primary_9074205 |
PublicationCentury | 2000 |
PublicationDate | 2020-March |
PublicationDateYYYYMMDD | 2020-03-01 |
PublicationDate_xml | – month: 03 year: 2020 text: 2020-March |
PublicationDecade | 2020 |
PublicationTitle | 2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS) |
PublicationTitleAbbrev | ICACCS |
PublicationYear | 2020 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
SSID | ssj0003189888 |
Score | 2.2022226 |
SourceID | ieee |
SourceType | Publisher |
StartPage | 136 |
SubjectTerms | code switching; dataset; Measurement; Natural language processing; Social network services; Switches; Tagging; Task analysis; Vocabulary |
Title | A Survey of Current Datasets for Code-Switching Research |
URI | https://ieeexplore.ieee.org/document/9074205 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
linkProvider | IEEE |