A Multi-Metric Model for analyzing and comparing extractive text summarization approaches and algorithms on scientific papers

Bibliographic Details
Published in	DÜMF Mühendislik Dergisi
Main Authors	Dursun, Mehmet Ali; Serttaş, Soydan
Format	Journal Article
Language	Turkish
Published	18.02.2024
Abstract	In today's world, where data and information are proliferating, text summarization technologies play a critical role in making large amounts of text data more accessible and meaningful. In business, the news industry, academic research, and many other fields, text summarization helps people make quick decisions, access information faster, and manage resources more effectively. Research in this area also continues to improve these technologies and to develop new methods and algorithms that summarize texts better. Text summarization and the research behind it are therefore of great importance in the information age. In this study, a new operating model for text summarization that can be applied to different algorithms is proposed and evaluated. Sixteen summarization algorithms covering six approaches (statistical, graph-based, content-based, pointer-based, position-based, and user-oriented) were implemented and tested on 50 different full-text article datasets. Four evaluation criteria (BLEU, ROUGE-N, ROUGE-L, METEOR) were used to assess the similarity between the generated summaries and the original summaries. The performance of the algorithms within each approach was averaged, and the overall best-performing algorithm was selected. This best algorithm was then analyzed further through topic modeling and keyword extraction to identify key topics and keywords within the summarized text. The proposed model provides a standardized workflow for developing and thoroughly testing summarization algorithms across datasets and evaluation metrics to determine the most appropriate summarization approach. This study demonstrates the effectiveness of the model on a variety of algorithm types and text sources.
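The scoring and selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes whitespace tokenization, uses the F1 variants of ROUGE-N and ROUGE-L, leaves out BLEU and METEOR for brevity, and combines the scores with an equal-weight mean. The function names and the `summaries` / `approach_of` inputs are hypothetical.

```python
from collections import Counter
from statistics import mean


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, candidate_total, reference_total):
    """Harmonic mean of precision and recall; 0.0 when nothing overlaps."""
    if overlap == 0 or candidate_total == 0 or reference_total == 0:
        return 0.0
    precision = overlap / candidate_total
    recall = overlap / reference_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n=1):
    """ROUGE-N F1: clipped n-gram overlap between candidate and reference."""
    cand, ref = ngrams(candidate.split(), n), ngrams(reference.split(), n)
    overlap = sum((cand & ref).values())  # Counter & keeps the minimum counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    cand, ref = candidate.split(), reference.split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))


def best_algorithm(summaries, reference, approach_of):
    """Score each algorithm, average per approach, return the top algorithm.

    summaries:   {algorithm_name: generated_summary}   (hypothetical input)
    approach_of: {algorithm_name: approach_name}       (hypothetical input)
    """
    scores = {
        name: mean([rouge_n(text, reference, 1),
                    rouge_n(text, reference, 2),
                    rouge_l(text, reference)])
        for name, text in summaries.items()
    }
    by_approach = {}
    for name, score in scores.items():
        by_approach.setdefault(approach_of[name], []).append(score)
    approach_means = {a: mean(v) for a, v in by_approach.items()}
    return max(scores, key=scores.get), scores, approach_means
```

Calling `best_algorithm` with one generated summary per algorithm and the article's original summary as `reference` returns the name of the highest-scoring algorithm together with the per-algorithm and per-approach averages. A fuller evaluation would add BLEU and METEOR and repeat the scoring over every article in the collection.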
Author	Serttaş, Soydan; Dursun, Mehmet Ali
Author_xml	– sequence: 1 givenname: Mehmet Ali orcidid: 0000-0001-6370-1160 surname: Dursun fullname: Dursun, Mehmet Ali – sequence: 2 givenname: Soydan orcidid: 0000-0001-8887-8675 surname: Serttaş fullname: Serttaş, Soydan
ContentType	Journal Article
DBID	AAYXX CITATION
DOI	10.24012/dumf.1376978
DatabaseName	CrossRef
DatabaseTitle	CrossRef
DatabaseTitleList	CrossRef
DeliveryMethod	fulltext_linktorsrc
ExternalDocumentID	10_24012_dumf_1376978
GroupedDBID	AAYXX CITATION EN8
ID	FETCH-crossref_primary_10_24012_dumf_13769783
ISSN	1309-8640
IngestDate	Fri Aug 23 01:38:18 EDT 2024
IsPeerReviewed	true
IsScholarly	true
Language	Turkish
LinkModel	OpenURL
MergedId	FETCHMERGED-crossref_primary_10_24012_dumf_13769783
ORCID	0000-0001-8887-8675 0000-0001-6370-1160
ParticipantIDs	crossref_primary_10_24012_dumf_1376978
PublicationCentury	2000
PublicationDate	2024-02-18
PublicationDateYYYYMMDD	2024-02-18
PublicationDate_xml	– month: 02 year: 2024 text: 2024-02-18 day: 18
PublicationDecade	2020
PublicationTitle	DÜMF Mühendislik Dergisi
PublicationYear	2024
SSID	ssib054345673 ssib044734483
Score	4.6024795
SourceID	crossref
SourceType	Aggregation Database
Title	A Multi-Metric Model for analyzing and comparing extractive text summarization approaches and algorithms on scientific papers
hasFullText	1
inHoldings	1
linkProvider	ISSN International Centre