Specialized Pre-Training of Neural Networks on Synthetic Data for Improving Paraphrase Generation
Published in | Cybernetics and Systems Analysis, Vol. 60, No. 2, pp. 167–174 |
---|---|
Main Authors | Skurzhanskyi, O. H.; Marchenko, O. O.; Anisimov, A. V. |
Format | Journal Article |
Language | English |
Published | New York: Springer US, 01.03.2024 (Springer; Springer Nature B.V.) |
ISSN | 1060-0396 (print); 1573-8337 (electronic) |
DOI | 10.1007/s10559-024-00658-7 |
Abstract | Paraphrase generation is a fundamental problem in natural language processing. Due to the significant success of transfer learning, the “pre-training → fine-tuning” approach has become the standard. However, popular general pre-training methods typically require extensive datasets and great computational resources, and the available pre-trained models are limited by fixed architecture and size. The authors have proposed a simple and efficient approach to pre-training specifically for paraphrase generation, which noticeably improves the quality of paraphrase generation and ensures substantial enhancement of general-purpose models. They have used existing public data and new data generated by large language models. The authors have investigated how this pre-training procedure impacts neural networks of various architectures and demonstrated its efficiency across all architectures. |
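The abstract describes a two-stage recipe: specialized pre-training on synthetic paraphrase pairs (drawn from public data or generated by large language models), followed by ordinary fine-tuning of the same weights on a labeled paraphrase corpus. Below is a minimal sketch of that flow using Hugging Face `transformers`; the `t5-small` backbone, the toy sentence pairs, and all hyperparameters are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the "pre-training -> fine-tuning" recipe described in the
# abstract. NOT the authors' code: the checkpoint, column names, toy data, and
# hyperparameters below are assumptions chosen for illustration.
from datasets import Dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("t5-small")  # any seq2seq backbone
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

def encode(batch):
    # Each record is a (source sentence, paraphrase) pair.
    enc = tokenizer(batch["source"], truncation=True, max_length=64)
    enc["labels"] = tokenizer(
        batch["paraphrase"], truncation=True, max_length=64
    )["input_ids"]
    return enc

# Stage 1: specialized pre-training on synthetic paraphrase pairs,
# e.g. pairs generated by a large language model.
synthetic = Dataset.from_dict({
    "source": ["how do i reset my password?"],
    "paraphrase": ["what is the way to reset my password?"],
}).map(encode, batched=True, remove_columns=["source", "paraphrase"])

# Stage 2: fine-tuning the same weights on the smaller human-labeled corpus.
labeled = Dataset.from_dict({
    "source": ["the weather is nice today."],
    "paraphrase": ["today the weather is pleasant."],
}).map(encode, batched=True, remove_columns=["source", "paraphrase"])

collator = DataCollatorForSeq2Seq(tokenizer, model=model)
for stage, data in [("pretrain", synthetic), ("finetune", labeled)]:
    trainer = Seq2SeqTrainer(
        model=model,
        args=Seq2SeqTrainingArguments(
            output_dir=f"out-{stage}",
            num_train_epochs=1,
            per_device_train_batch_size=1,
        ),
        train_dataset=data,
        data_collator=collator,
    )
    trainer.train()  # stage-1 weights carry over as the stage-2 initialization
```

Because both stages reuse one model object, whatever the network learns from the synthetic pairs in stage 1 becomes the initialization for stage 2, which is the point of specialized pre-training over training from a generic checkpoint alone.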
Audience | Academic |
Author | Anisimov, A. V.; Skurzhanskyi, O. H.; Marchenko, O. O. |
Author_xml | 1. Skurzhanskyi, O. H. (oleksandr.skurzhanskyi@gmail.com), Taras Shevchenko National University of Kyiv; 2. Marchenko, O. O., Taras Shevchenko National University of Kyiv; 3. Anisimov, A. V., Taras Shevchenko National University of Kyiv |
ContentType | Journal Article |
Copyright | Springer Science+Business Media, LLC, part of Springer Nature 2024. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. COPYRIGHT 2024 Springer |
DOI | 10.1007/s10559-024-00658-7 |
DatabaseName | CrossRef; Gale In Context: Science (UHCL Subscription); ProQuest Computer Science Collection |
DatabaseTitle | CrossRef; ProQuest Computer Science Collection |
Discipline | Sciences (General); Mathematics |
EISSN | 1573-8337 |
EndPage | 174 |
ISSN | 1060-0396 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 2 |
Keywords | fine tuning; paraphrase generation; machine learning; pre-training; artificial intelligence; neural networks |
Language | English |
PageCount | 8 |
PublicationDate | 2024-03-01 |
PublicationPlace | New York |
PublicationTitle | Cybernetics and systems analysis |
PublicationTitleAbbrev | Cybern Syst Anal |
PublicationYear | 2024 |
Publisher | Springer US; Springer; Springer Nature B.V |
StartPage | 167 |
SubjectTerms | Artificial Intelligence; Computational linguistics; Control; Cybernetics; Language processing; Large language models; Machine learning; Mathematics; Mathematics and Statistics; Natural language interfaces; Natural language processing; Neural networks; Processor Architectures; Software Engineering/Programming and Operating Systems; Synthetic data; Systems Theory |
Title | Specialized Pre-Training of Neural Networks on Synthetic Data for Improving Paraphrase Generation |
URI | https://link.springer.com/article/10.1007/s10559-024-00658-7 https://www.proquest.com/docview/3034029480 |
Volume | 60 |