An N-Gram Based Method for Bengali Keyphrase Extraction


Bibliographic Details
Published in Information Systems for Indian Languages, pp. 36-41
Main Author Sarkar, Kamal
Format Book Chapter
Language English
Published Berlin, Heidelberg Springer Berlin Heidelberg 2011
Series Communications in Computer and Information Science
ISBN 9783642194023
3642194028
ISSN 1865-0929
1865-0937
DOI 10.1007/978-3-642-19403-0_6


Abstract Keyphrases provide subject metadata that gives clues about the content of a document. In this paper, we present a new method for Bengali keyphrase extraction. The proposed method has several steps, including extraction of n-grams, identification of candidate keyphrases, and assignment of scores to the candidate keyphrases. Since Bengali is a highly inflectional language, we have developed a lightweight stemmer for stemming the candidate keyphrases. The proposed method has been tested on a collection of Bengali documents selected from a Bengali corpus downloadable from the TDIL website.
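The steps named in the abstract (n-gram extraction, candidate identification, scoring, and light stemming) can be illustrated with a minimal Python sketch. The abstract does not give the stemmer rules, the candidate filter, or the scoring function, so everything below is an assumption made for illustration: the suffix list, the stopword list, the rule that a candidate may not start or end with a stopword, and the frequency-times-length score are all hypothetical and are not the paper's method.

```python
from collections import Counter

# Hypothetical inflectional suffixes; the paper's actual stemmer rules
# are not stated in the abstract.
SUFFIXES = ("গুলো", "দের", "কে", "রা", "ের", "র", "ে")

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"এবং", "ও", "এই", "একটি"}


def stem(word, min_stem_len=2):
    """Strip the longest matching suffix, keeping at least min_stem_len characters."""
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


def ngrams(tokens, max_n=3):
    """Yield every n-gram of length 1..max_n as a tuple of tokens."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def extract_keyphrases(text, top_k=5, max_n=3):
    """Rank stemmed n-gram candidates by an assumed frequency * length score."""
    counts = Counter()
    # Split on the Bengali danda so n-grams do not cross sentence boundaries.
    for sentence in text.split("।"):
        tokens = sentence.split()
        for gram in ngrams(tokens, max_n):
            # Assumed candidate filter: no stopword at either end of the phrase.
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            counts[tuple(stem(word) for word in gram)] += 1
    # Placeholder score: longer repeated phrases outrank single words.
    scored = {gram: count * len(gram) for gram, count in counts.items()}
    ranked = sorted(scored.items(), key=lambda item: (-item[1], item[0]))
    return [" ".join(gram) for gram, _ in ranked[:top_k]]
```

Calling extract_keyphrases(document_text, top_k=5) returns the five highest-scoring stemmed phrases. Stemming before counting merges inflected variants of the same phrase into one candidate, which is the role the abstract assigns to the stemmer; a real system would replace the placeholder score with the chapter's own scoring scheme.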
Author Sarkar, Kamal
Author_xml – sequence: 1
  givenname: Kamal
  surname: Sarkar
  fullname: Sarkar, Kamal
  email: jukamal2001@yahoo.com
  organization: Computer Science & Engineering Department, Jadavpur University, Kolkata, India
ContentType Book Chapter
Copyright Springer-Verlag Berlin Heidelberg 2011
Copyright_xml – notice: Springer-Verlag Berlin Heidelberg 2011
DOI 10.1007/978-3-642-19403-0_6
Discipline Engineering
Computer Science
EISBN 9783642194030
3642194036
EISSN 1865-0937
Editor Goyal, Vishal
Sharma, Dharam Veer
Singh, Chandan
Sengupta, Jyotsna
Singh Lehal, Gurpreet
Editor_xml – sequence: 1
  givenname: Chandan
  surname: Singh
  fullname: Singh, Chandan
  email: chandan.csp@gmail.com
– sequence: 2
  givenname: Gurpreet
  surname: Singh Lehal
  fullname: Singh Lehal, Gurpreet
  email: gslehal@gmail.com
– sequence: 3
  givenname: Jyotsna
  surname: Sengupta
  fullname: Sengupta, Jyotsna
  email: jyotsna.sengupta@gmail.com
– sequence: 4
  givenname: Dharam Veer
  surname: Sharma
  fullname: Sharma, Dharam Veer
  email: dveer72@gmail.com
– sequence: 5
  givenname: Vishal
  surname: Goyal
  fullname: Goyal, Vishal
  email: vishal.pup@gmail.com
EndPage 41
ISBN 9783642194023
3642194028
ISSN 1865-0929
IsPeerReviewed false
IsScholarly false
Language English
PageCount 6
PublicationCentury 2000
PublicationDate 2011
PublicationDateYYYYMMDD 2011-01-01
PublicationDate_xml – year: 2011
  text: 2011
PublicationDecade 2010
PublicationPlace Berlin, Heidelberg
PublicationPlace_xml – name: Berlin, Heidelberg
PublicationSeriesTitle Communications in Computer and Information Science
PublicationSubtitle International Conference, ICISIL 2011, Patiala, India, March 9-11, 2011. Proceedings
PublicationTitle Information Systems for Indian Languages
PublicationYear 2011
Publisher Springer Berlin Heidelberg
Publisher_xml – name: Springer Berlin Heidelberg
StartPage 36
SubjectTerms Bengali keyphrase extraction
Information retrieval
Metadata
Title An N-Gram Based Method for Bengali Keyphrase Extraction
URI http://link.springer.com/10.1007/978-3-642-19403-0_6