Topic Categorization of Tamil News Articles

Bibliographic Details
Published in 2022 International Conference on Computer Communication and Informatics (ICCCI), pp. 1-6
Main Authors Selvi, C.S.Kanimozhi; N, Induja; L, Lekshmi S; S, Nagammai
Format Conference Proceeding
Language English
Published IEEE 25.01.2022
DOI 10.1109/ICCCI54379.2022.9741061

Abstract This project develops a deep learning model for topic categorization of Tamil news articles. Learning for regional languages such as Tamil and Hindi differs from learning for English. Categorising Tamil news articles by topic is a text classification task, and this study treats it as multiclass classification. Text categorization is already used in many applications, such as web search, information filtering, language recognition, readability rating, and sentiment analysis, and these tasks rely heavily on neural networks. Techniques ranging from classical machine learning to deep learning are used to handle almost all NLP problems. Even so, language localization remains largely unsolved: for languages other than English, NLP problems such as entity extraction, prediction, classification, and OCR in sequence modelling are still open. Because the number of people using local languages such as Tamil, Hindi, and Telugu on social media is growing, an automatic classification process is needed. The goal is to categorise Tamil news stories into subjects (Sports, Cinema, Politics). Naive Bayes (NB), CNN, and SVM algorithms are implemented to classify the content of Tamil news articles by category, as was done in previous work. The performance of these algorithms is compared with that of an RNN, and Precision, Recall, and F1-score are reported.
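The abstract names Precision, Recall, and F1-score as the reported metrics for a three-class problem. The following is a minimal sketch of how such per-class scores can be computed, not the authors' evaluation code: the toy labels, the function names `per_class_scores` and `macro_f1`, and the choice of macro averaging are all assumptions made for illustration, since the record does not say how the paper aggregates its scores.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for multiclass predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1  # predicted this class, but the article belongs elsewhere
            fn[truth] += 1  # an article of the true class was missed
    scores = {}
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        f1_den = precision + recall
        f1 = 2 * precision * recall / f1_den if f1_den else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(scores):
    """Unweighted mean of the per-class F1 values (assumed averaging scheme)."""
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


# Hypothetical toy data using the three categories named in the abstract.
y_true = ["Sports", "Sports", "Cinema", "Cinema", "Politics", "Politics"]
y_pred = ["Sports", "Cinema", "Cinema", "Cinema", "Politics", "Sports"]

scores = per_class_scores(y_true, y_pred)
for label, (precision, recall, f1) in scores.items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
print(f"macro F1: {macro_f1(scores):.2f}")
```

The same function can be applied to the predictions of each classifier (NB, SVM, CNN, RNN) on a shared test set so that the resulting scores are directly comparable.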
Author L, Lekshmi S
Selvi, C.S.Kanimozhi
N, Induja
S, Nagammai
Author_xml – sequence: 1
  givenname: C.S.Kanimozhi
  surname: Selvi
  fullname: Selvi, C.S.Kanimozhi
  email: kec.kanimozhi@gmail.com
  organization: Kongu Engineering College,Department of Information Technology,Erode,Tamilnadu,India
– sequence: 2
  givenname: Induja
  surname: N
  fullname: N, Induja
  email: indujanainamalai@gmail.com
  organization: Kongu Engineering College,Department of Computer Science and Engineering,India
– sequence: 3
  givenname: Lekshmi S
  surname: L
  fullname: L, Lekshmi S
  email: Iekshmi.mani74@gmail.com
  organization: Kongu Engineering College,Department of Computer Science and Engineering,India
– sequence: 4
  givenname: Nagammai
  surname: S
  fullname: S, Nagammai
  email: snagammai2101@gmail.com
  organization: Kongu Engineering College,Department of Computer Science and Engineering,India
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/ICCCI54379.2022.9741061
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
EISBN 1665480351
9781665480352
EndPage 6
ExternalDocumentID 9741061
Genre orig-research
GroupedDBID 6IE
6IL
CBEJK
RIE
RIL
ID FETCH-LOGICAL-i810-c13347fc22a66c27c5abf4337f5082fb8583ff64c799c4ca5193b51c618deaaa3
IEDL.DBID RIE
IngestDate Thu Jun 29 18:36:55 EDT 2023
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-i810-c13347fc22a66c27c5abf4337f5082fb8583ff64c799c4ca5193b51c618deaaa3
PageCount 6
ParticipantIDs ieee_primary_9741061
PublicationCentury 2000
PublicationDate 2022-Jan.-25
PublicationDateYYYYMMDD 2022-01-25
PublicationDate_xml – month: 01
  year: 2022
  text: 2022-Jan.-25
  day: 25
PublicationDecade 2020
PublicationTitle 2022 International Conference on Computer Communication and Informatics (ICCCI)
PublicationTitleAbbrev ICCCI
PublicationYear 2022
Publisher IEEE
Publisher_xml – name: IEEE
Score 1.7857523
SourceID ieee
SourceType Publisher
StartPage 1
SubjectTerms Deep learning
Predictive models
Social networking (online)
Speech recognition
Support vector machines
Text categorization
Text recognition
Title Topic Categorization of Tamil News Articles
URI https://ieeexplore.ieee.org/document/9741061