An N-Gram Based Method for Bengali Keyphrase Extraction
Keyphrases provide subject metadata that gives clues about the content of a document. In this paper, we present a new method for Bengali keyphrase extraction. The proposed method has several steps, such as extraction of n-grams, identification of candidate keyphrases, and assigning scores to t...
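The steps named in the abstract (n-gram extraction, candidate identification, scoring, plus light stemming) can be sketched roughly as follows. This record does not give the paper's actual candidate rules, scoring function, or stemmer, so everything here is an assumption made for illustration: the stopword list, the suffix list, the boundary-stopword filter, and the frequency-times-length score are all hypothetical stand-ins.

```python
from collections import Counter

# Hypothetical resources, for illustration only (not taken from the paper).
STOPWORDS = {"এবং", "ও", "এই"}
# Suffixes are tried in this order, so longer ones come first.
SUFFIXES = ("গুলি", "দের", "ের", "টি", "কে", "রা")


def stem(word, min_stem_len=2):
    """Strip at most one listed suffix, keeping at least min_stem_len code points."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


def ngrams(tokens, max_n=3):
    """Yield every contiguous n-gram of length 1..max_n as a tuple."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def is_candidate(gram):
    """Assumed filter: a candidate may not start or end with a stopword."""
    return gram[0] not in STOPWORDS and gram[-1] not in STOPWORDS


def score_keyphrases(text, max_n=3, top_k=5):
    """Return the top_k (phrase, score) pairs for a Bengali text."""
    counts = Counter()
    # Split on the danda so n-grams never cross a sentence boundary.
    for sentence in text.split("।"):
        tokens = sentence.split()
        for gram in ngrams(tokens, max_n):
            if is_candidate(gram):
                # Stem each word so inflected variants are counted together.
                counts[tuple(stem(w) for w in gram)] += 1
    # Assumed score: frequency weighted by phrase length in words.
    scored = {" ".join(g): c * len(g) for g, c in counts.items()}
    return sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


if __name__ == "__main__":
    sample = "বাংলা ভাষা সুন্দর। বাংলা ভাষা মধুর।"
    for phrase, score in score_keyphrases(sample):
        print(phrase, score)
```

Weighting frequency by length is only one simple way to favor multi-word phrases over their component words; a real system would likely also use position or corpus statistics, which this sketch omits.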
Published in | Information Systems for Indian Languages, pp. 36-41 |
---|---|
Main Author | Sarkar, Kamal |
Format | Book Chapter |
Language | English |
Published | Berlin, Heidelberg: Springer Berlin Heidelberg, 2011 |
Series | Communications in Computer and Information Science |
ISBN | 9783642194023, 3642194028 |
ISSN | 1865-0929, 1865-0937 |
DOI | 10.1007/978-3-642-19403-0_6 |
Abstract | Keyphrases provide subject metadata that gives clues about the content of a document. In this paper, we present a new method for Bengali keyphrase extraction. The proposed method has several steps, such as extraction of n-grams, identification of candidate keyphrases, and assigning scores to the candidate keyphrases. Since Bengali is a highly inflectional language, we have developed a lightweight stemmer for stemming the candidate keyphrases. The proposed method has been tested on a collection of Bengali documents selected from a Bengali corpus downloadable from the TDIL website. |
---|---|
Author | Sarkar, Kamal |
Author affiliation | Computer Science & Engineering Department, Jadavpur University, Kolkata, India (jukamal2001@yahoo.com) |
BookMark | eNpVkEFOwzAQRQ0UiVJyAja-gGHGdhJ72ValIApsYG05jtMGSlzZXcDtMQUhMZuR_pO-9N85GQ1h8IRcIlwhQH2ta8UEqyRnqCUIBqY6IkVORc4OERyTMaqqZKBFffKPcTH6Y1yfkSKlV8hXKlC6HJN6OtBHtoz2nc5s8i198PtNaGkXIp35YW23Pb33n7tNzJQuPvbRun0fhgty2tlt8sXvn5CXm8Xz_JatnpZ38-mKJUReMduh9rVXpW9cU6MCcL5RqtWyQXRSyI7bVre6cx6501xV0lmnRScdCq5BTAj-9KZd7Ie1j6YJ4S0ZBPMtx-SpRpg81hxMmCxHfAELXlK9 |
ContentType | Book Chapter |
Copyright | Springer-Verlag Berlin Heidelberg 2011 |
Discipline | Engineering; Computer Science |
EISBN | 9783642194030, 3642194036 |
EISSN | 1865-0937 |
Editor | Goyal, Vishal; Sharma, Dharam Veer; Singh, Chandan; Sengupta, Jyotsna; Singh Lehal, Gurpreet |
EndPage | 41 |
IsPeerReviewed | false |
IsScholarly | false |
PageCount | 6 |
PublicationDate | 2011 |
PublicationPlace | Berlin, Heidelberg |
PublicationSeriesTitle | Communications in Computer and Information Science |
PublicationSubtitle | International Conference, ICISIL 2011, Patiala, India, March 9-11, 2011. Proceedings |
PublicationTitle | Information Systems for Indian Languages |
PublicationYear | 2011 |
Publisher | Springer Berlin Heidelberg |
StartPage | 36 |
SubjectTerms | Bengali keyphrase extraction; Information retrieval; Metadata |
Title | An N-Gram Based Method for Bengali Keyphrase Extraction |
URI | http://link.springer.com/10.1007/978-3-642-19403-0_6 |