An N-Gram Based Method for Bengali Keyphrase Extraction

Bibliographic Details
Published in	Information Systems for Indian Languages, pp. 36-41
Main Author	Sarkar, Kamal
Format	Book Chapter
Language	English
Published	Berlin, Heidelberg: Springer Berlin Heidelberg, 2011
Series	Communications in Computer and Information Science
Subjects	Bengali keyphrase extraction; Information retrieval; Metadata
ISBN	9783642194023 3642194028
ISSN	1865-0929 1865-0937
DOI	10.1007/978-3-642-19403-0_6

Summary:	Keyphrases provide subject metadata that gives clues about the content of a document. In this paper, we present a new method for Bengali keyphrase extraction. The proposed method has several steps: extracting n-grams, identifying candidate keyphrases, and assigning scores to the candidates. Since Bengali is a highly inflectional language, we have developed a lightweight stemmer for stemming the candidate keyphrases. The proposed method has been tested on a collection of Bengali documents selected from a Bengali corpus downloadable from the TDIL website.
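
The steps named in the summary (n-gram extraction, candidate identification, stemming, scoring) can be illustrated with a minimal Python sketch. This is not the chapter's actual method: the record does not give the stopword list, the stemmer rules, or the scoring formula, so `STOPWORDS`, `SUFFIXES`, and the length-weighted frequency score below are hypothetical placeholders chosen only to show how the stages fit together.

```python
import re
from collections import Counter

# Hypothetical lists for illustration; the chapter's real resources are not given here.
STOPWORDS = {"এবং", "ও", "এই"}
SUFFIXES = ("গুলো", "দের", "কে", "রা")  # checked in this order, longest first


def stem(word):
    """Strip one known inflectional suffix, keeping at least two characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            return word[: -len(suffix)]
    return word


def ngrams(tokens, max_n=3):
    """Yield every n-gram (as a tuple) for n = 1..max_n."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def extract_keyphrases(text, max_n=3, top_k=5):
    """Return the top_k stemmed candidate phrases by an assumed score."""
    scores = Counter()
    # Split on the Bengali danda and common sentence punctuation so that
    # n-grams never cross a sentence boundary.
    for sentence in re.split(r"[।?!\n]+", text):
        tokens = sentence.split()
        for gram in ngrams(tokens, max_n):
            # Candidate filter (assumption): no stopword at either end.
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            key = tuple(stem(w) for w in gram)
            # Assumed score: frequency weighted by phrase length.
            scores[key] += len(gram)
    return [" ".join(key) for key, _ in scores.most_common(top_k)]


if __name__ == "__main__":
    sample = "ছাত্রদের বই। ছাত্ররা বই কেনে।"
    print(extract_keyphrases(sample, top_k=3))
```

In the sample text, the two inflected forms ছাত্রদের and ছাত্ররা reduce to the same stem, so their counts are pooled before scoring. That pooling is the reason the summary gives for stemming candidates in a highly inflectional language.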