Multilingual Mysteries The Art of Automated Language Identification

This research paper presents a comprehensive approach to language detection utilizing neural network with features of natural language processing techniques. The system begins by ingesting audio files, converting them into textual data through audio-to-text conversion processes. Subsequently, a ML o...

Full description

Saved in:
Bibliographic Details
Published in2024 2nd International Conference on Recent Advances in Information Technology for Sustainable Development (ICRAIS) pp. 142 - 147
Main Authors Hegde, Rajalaxmi, Ranjani, Hegde, Sandeep Kumar
Format Conference Proceeding
LanguageEnglish
Published IEEE 06.11.2024
Subjects
Online AccessGet full text
DOI10.1109/ICRAIS62903.2024.10811728

Cover

More Information
Summary:This research paper presents a comprehensive approach to language detection utilizing neural network with features of natural language processing techniques. The system begins by ingesting audio files, converting them into textual data through audio-to-text conversion processes. Subsequently, a ML or neural model that has been trained on the Kaggle language detection dataset then processes and analyzes this textual input. Predicting the input's language is main aim of this work. Natural language processing can reliably identify the language that is detected and evaluated, depending on the material or subject matter given. Understanding what is being stated in any language will be simple after a thorough investigation. When combined with CNN and an attention mechanism, the unidirectional long short term model allows us to detect the language used in a given document. The work encompasses key phases such as data collection, data preprocessing, dataset partitioning for training and testing, model implementation, and the generation of language predictions. The implementation of this system holds considerable technical significance, serving as a valuable tool in applications like automated transcription services, multilingual content analysis, and language-based data processing. The combination of NLP and neural network techniques empowers the system to accurately detect and categorize diverse languages, contributing to its robustness and practicality in real-world scenarios.
DOI:10.1109/ICRAIS62903.2024.10811728