An Efficient Text Summarization Using Term and Inverse Frequency With Key Phrase Identification in Malayalam Language
Published in | 2021 IEEE International Women in Engineering (WIE) Conference on Electrical and Computer Engineering (WIECON-ECE), pp. 145-148 |
---|---|
Main Authors | , , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 04.12.2021 |
Summary: | Malayalam is a morphologically rich language. Indian languages belong to several language families, such as Indo-Aryan, Sino-Tibetan, and Dravidian; Malayalam belongs to the Dravidian family. Text summarization in Indian languages is hard because of their rich content and the limited availability of annotated data. Here we propose a summarization system for Malayalam language documents based on the Term Frequency - Inverse Document Frequency (TF-IDF) scheme. Our summarizer accepts a single Malayalam text document as input and generates a summary document using TF-IDF weighting with keyword identification. The proposed method successfully summarized Malayalam literature documents with 90.6% accuracy. |
---|---|
DOI: | 10.1109/WIECON-ECE54711.2021.9829671 |
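
The summary describes extractive summarization of a single document using TF-IDF weights and keyword identification. The record does not give the paper's actual procedure, so the following is only a minimal sketch of that general idea, not the authors' method. It assumes that each sentence is treated as a "document" for the IDF term, that sentences end in `.`, `!` or `?`, and that words are separated by whitespace; it does no stemming or other Malayalam-specific morphological processing. All function names and parameters (`split_sentences`, `tokenize`, `tfidf_per_sentence`, `top_keywords`, `summarize`, `max_sentences`, `k`) are hypothetical.

```python
import math
import re
import string
from collections import Counter


def split_sentences(text):
    # Assumption: a sentence ends with '.', '!' or '?' followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    # Whitespace tokenization with ASCII punctuation stripped from word edges.
    # Splitting on whitespace keeps Malayalam vowel signs attached to words.
    tokens = (w.strip(string.punctuation).lower() for w in sentence.split())
    return [t for t in tokens if t]


def tfidf_per_sentence(sentences):
    # Assumption: each sentence plays the role of a "document" for IDF.
    tokenized = [tokenize(s) for s in sentences]
    n = len(tokenized)
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))  # count each term once per sentence
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append(
            {t: (c / total) * math.log(n / df[t]) for t, c in counts.items()}
        )
    return weights


def top_keywords(weights, k=5):
    # Keywords: the k terms with the largest TF-IDF weight summed over sentences.
    totals = Counter()
    for w in weights:
        totals.update(w)
    return [t for t, _ in totals.most_common(k)]


def summarize(text, max_sentences=2, k=5):
    sentences = split_sentences(text)
    weights = tfidf_per_sentence(sentences)
    keywords = set(top_keywords(weights, k))
    # Score a sentence by the TF-IDF mass of the keywords it contains.
    scores = [sum(v for t, v in w.items() if t in keywords) for w in weights]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    # Return the selected sentences in their original order.
    return [sentences[i] for i in sorted(ranked[:max_sentences])]
```

Calling `summarize(text, max_sentences=3)` on a plain-text document returns up to three sentences in their original order. A term that occurs in every sentence receives an IDF of zero under this scheme, so it never contributes to a sentence score.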