Cross Language Information Retrieval for Accessing the English Web in Sinhala
Published in | 2020 20th International Conference on Advances in ICT for Emerging Regions (ICTer) pp. 244 - 249 |
---|---|
Main Authors | , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 04.11.2020 |
Summary: | Searching the web in Sinhala does not return satisfactory results, so Sri Lankans who are not fluent in English find it difficult to browse the web for knowledge. This issue can be addressed by Cross Language Information Retrieval (CLIR), in which a query in Sinhala is matched with documents in English using a query translation approach. This study experimented with different models that use word embeddings to transform the Sinhala query into English; results were then retrieved through the Google Search API using the equivalent English query. The retrieved results were translated back to Sinhala and re-ranked using two different approaches. A user evaluation showed that re-ranking the results had no positive impact, but that retrieving results with the equivalent English query was effective. Hence, this study shows that the quality of the results obtained when searching the web in Sinhala can be improved by performing CLIR. |
---|---|
ISSN: | 2472-7598 |
DOI: | 10.1109/ICTer51097.2020.9325441 |