The impact of memory on learning sequence-to-sequence tasks

Bibliographic Details
Published in	Machine learning: science and technology Vol. 5; no. 1; pp. 15053 - 15068
Main Authors	Seif, Alireza; Loos, Sarah A M; Tucci, Gennaro; Roldán, Édgar; Goldt, Sebastian
Format	Journal Article
Language	English
Published	Bristol: IOP Publishing, 01.03.2024
Subjects	Artificial neural networks; auto-regressive model; Autoregressive models; Learning; memory; Memory tasks; Natural language processing; Neural networks; non-Markovianity; Performance degradation; Performance enhancement; recurrent neural network; sequence-to-sequence task; statistical physics
Abstract	The recent success of neural networks in natural language processing has drawn renewed attention to learning sequence-to-sequence (seq2seq) tasks. While there exists a rich literature that studies classification and regression tasks using solvable models of neural networks, seq2seq tasks have not yet been studied from this perspective. Here, we propose a simple model for a seq2seq task that has the advantage of providing explicit control over the degree of memory, or non-Markovianity, in the sequences—the stochastic switching-Ornstein–Uhlenbeck (SSOU) model. We introduce a measure of non-Markovianity to quantify the amount of memory in the sequences. For a minimal auto-regressive (AR) learning model trained on this task, we identify two learning regimes corresponding to distinct phases in the stationary state of the SSOU process. These phases emerge from the interplay between two different time scales that govern the sequence statistics. Moreover, we observe that while increasing the integration window of the AR model always improves performance, albeit with diminishing returns, increasing the non-Markovianity of the input sequences can improve or degrade its performance. Finally, we perform experiments with recurrent and convolutional neural networks that show that our observations carry over to more complicated neural network architectures.
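
The setup described in the abstract can be illustrated with a short Python sketch. It is not the authors' code, and it rests on assumptions the abstract does not state: the trap centre flips between +c0 and -c0, the waiting times between flips are gamma-distributed (shape 1 being the memoryless, Markovian case), the learning target is the hidden centre, and the AR model is a linear readout over a window of past positions trained by plain SGD. All function and parameter names (`simulate_ssou`, `fit_ar`, `mse`, `shape`, `window`) are hypothetical.

```python
import random


def simulate_ssou(n_steps, dt=0.05, kappa=1.0, c0=1.0, noise=0.5,
                  shape=2.0, mean_wait=5.0, seed=0):
    """Euler-Maruyama sketch of a switching Ornstein-Uhlenbeck process.

    Assumption: the centre c(t) flips sign after gamma-distributed waiting
    times. shape=1 gives exponential waiting times (memoryless switching);
    other shapes make the switching depend on the time since the last flip.
    """
    rng = random.Random(seed)
    scale = mean_wait / shape            # gamma mean = shape * scale
    x, c, t = 0.0, c0, 0.0
    next_flip = rng.gammavariate(shape, scale)
    xs, cs = [], []
    for _ in range(n_steps):
        while t >= next_flip:            # apply every flip that is due
            c = -c
            next_flip += rng.gammavariate(shape, scale)
        # Relax towards the current centre and add Gaussian noise.
        x += -kappa * (x - c) * dt + noise * dt ** 0.5 * rng.gauss(0.0, 1.0)
        t += dt
        xs.append(x)
        cs.append(c)
    return xs, cs


def predict(w, xs, t):
    """Linear readout of the last len(w) positions ending at index t."""
    window = len(w)
    return sum(wi * xi for wi, xi in zip(w, xs[t - window + 1:t + 1]))


def fit_ar(xs, cs, window, epochs=5, lr=0.01):
    """Fit the readout weights by SGD on the squared error to c(t)."""
    w = [0.0] * window
    for _ in range(epochs):
        for t in range(window - 1, len(xs)):
            err = predict(w, xs, t) - cs[t]
            feats = xs[t - window + 1:t + 1]
            for i in range(window):
                w[i] -= lr * err * feats[i]
    return w


def mse(w, xs, cs):
    """Mean squared error of the readout over the whole sequence."""
    window = len(w)
    errs = [(predict(w, xs, t) - cs[t]) ** 2
            for t in range(window - 1, len(xs))]
    return sum(errs) / len(errs)


if __name__ == "__main__":
    xs, cs = simulate_ssou(4000)
    for window in (1, 4, 8):
        w = fit_ar(xs, cs, window)
        print(window, mse(w, xs, cs))
```

Varying `window` probes the integration-window effect mentioned in the abstract, and varying `shape` changes how much memory the switching carries. The sketch makes no claim to reproduce the paper's quantitative results.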
Author Affiliations	1. Seif, Alireza (ORCID 0000-0001-5419-5999): Pritzker School of Molecular Engineering, University of Chicago, Chicago, IL 60637, United States of America; 2. Loos, Sarah A M (ORCID 0000-0002-5946-5684): University of Cambridge DAMTP, Centre for Mathematical Sciences, Cambridge CB3 0WA, United Kingdom; 3. Tucci, Gennaro: Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany; 4. Roldán, Édgar (ORCID 0000-0001-7196-8404): ICTP, The Abdus Salam International Centre for Theoretical Physics, Trieste, Italy; 5. Goldt, Sebastian (ORCID 0000-0002-5799-7644): International School of Advanced Studies (SISSA), Trieste, Italy
CODEN	MLSTCK
Copyright	2024 The Author(s). Published by IOP Publishing Ltd. This work is published under http://creativecommons.org/licenses/by/4.0 (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI	10.1088/2632-2153/ad2feb
Discipline	Computer Science
EISSN	2632-2153
Funding	Chicago Prize Postdoctoral Fellowship
ISSN	2632-2153
IsDoiOpenAccess	true
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
Issue	1
License	Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
OpenAccessLink	https://iopscience.iop.org/article/10.1088/2632-2153/ad2feb
PageCount	16
PublicationDate	2024-03-01
PublicationTitleAbbrev	MLST
PublicationTitleAlternate	Mach. Learn.: Sci. Technol