A Multi-Metric Model for analyzing and comparing extractive text summarization approaches and algorithms on scientific papers

Bibliographic Details
Published in DÜMF Mühendislik Dergisi
Main Authors Dursun, Mehmet Ali; Serttaş, Soydan
Format Journal Article
Language Turkish
Published 18.02.2024

Abstract As data and information proliferate, text summarization technologies play a critical role in making large volumes of text more accessible and meaningful. In business, the news industry, academic research, and many other fields, text summarization supports quick decisions, faster access to information, and more effective management of resources. Research in the field continues to improve these technologies and to develop new methods and algorithms that summarize texts better. Text summarization and the research behind it are therefore of great importance in the information age. In this study, a new operating model for text summarization that can be applied to different algorithms is proposed and evaluated. Sixteen summarization algorithms covering six approaches (statistical, graph-based, content-based, pointer-based, position-based, and user-oriented) were implemented and tested on 50 different full-text article datasets. Four evaluation criteria (BLEU, ROUGE-N, ROUGE-L, METEOR) were used to assess the similarity between the generated summaries and the original summaries. The performance of the algorithms within each approach was averaged, and the overall best-performing algorithm was selected. This algorithm was then analyzed further through topic modeling and keyword extraction to identify the key topics and keywords in the summarized text. The proposed model provides a standardized workflow for developing and thoroughly testing summarization algorithms across datasets and evaluation metrics in order to determine the most appropriate summarization approach. The study demonstrates the effectiveness of the model on a variety of algorithm types and text sources.
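The evaluation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes lowercased whitespace tokenization, reports the F1 variant of ROUGE-N and ROUGE-L, and uses a hypothetical `scores` layout (approach, then algorithm, then one score per document) to average within each approach and pick the best-scoring algorithm. BLEU and METEOR are left out for brevity.

```python
from collections import Counter
from statistics import mean


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision and recall; 0.0 when nothing overlaps."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n=1):
    """ROUGE-N as F1 over clipped n-gram overlap (assumed variant)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # Counter & keeps the minimum counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L as F1 over the longest common subsequence of tokens."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))


def best_algorithm(scores):
    """scores: {approach: {algorithm: [score per document]}} (hypothetical layout).

    Returns the mean score of each approach and the single best
    (approach, algorithm, mean_score) triple across all approaches.
    """
    approach_means = {}
    best = None
    for approach, algos in scores.items():
        algo_means = {name: mean(vals) for name, vals in algos.items()}
        approach_means[approach] = mean(list(algo_means.values()))
        for name, m in algo_means.items():
            if best is None or m > best[2]:
                best = (approach, name, m)
    return approach_means, best
```

In practice the per-document lists would hold one metric's scores for each generated summary against the paper's original summary, and the same selection could be repeated per metric or over an average of several metrics; the record does not say which the study used.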
Author Details
Dursun, Mehmet Ali (ORCID 0000-0001-6370-1160)
Serttaş, Soydan (ORCID 0000-0001-8887-8675)
DOI 10.24012/dumf.1376978
ISSN 1309-8640
Peer Reviewed Yes
Scholarly Yes