ParCoLab: A Parallel Corpus for Serbian, French and English

Bibliographic Details
Published in Text, Speech, and Dialogue Vol. 10415; pp. 156-164
Main Authors Miletic, Aleksandra; Stosic, Dejan; Marjanović, Saša
Format Book Chapter
Language English
Published Switzerland Springer International Publishing AG 2017
Springer International Publishing
Series Lecture Notes in Computer Science
Summary: ParCoLab is a trilingual parallel corpus containing texts in Serbian, French and English. It is developed at the CLLE-ERSS research unit (UMR 5263 CNRS) at the University of Toulouse, France, in collaboration with the Department of Romance Studies at the University of Belgrade, Serbia. Serbian being one of the less-resourced European languages, this is an important step towards the creation of freely accessible corpora and NLP tools for this language. Our main goal is to provide the scientific community with a high-quality resource that can be used in a wide range of applications, such as contrastive linguistic studies, NLP research, machine and computer-assisted translation, translation studies, second language learning and teaching, and applied lexicography. The corpus currently contains 7.1M tokens mainly from literary works, but corpus extension and diversification efforts are ongoing. ParCoLab can be queried online and a part of it is available for download.
ISBN: 3319642057, 9783319642055
ISSN: 0302-9743, 1611-3349
DOI: 10.1007/978-3-319-64206-2_18