The impact of memory on learning sequence-to-sequence tasks

Bibliographic Details
Published in Machine learning: science and technology Vol. 5; no. 1; pp. 15053 - 15068
Main Authors Seif, Alireza; Loos, Sarah A M; Tucci, Gennaro; Roldán, Édgar; Goldt, Sebastian
Format Journal Article
Language English
Published Bristol IOP Publishing 01.03.2024
Abstract The recent success of neural networks in natural language processing has drawn renewed attention to learning sequence-to-sequence (seq2seq) tasks. While there exists a rich literature that studies classification and regression tasks using solvable models of neural networks, seq2seq tasks have not yet been studied from this perspective. Here, we propose a simple model for a seq2seq task that has the advantage of providing explicit control over the degree of memory, or non-Markovianity, in the sequences—the stochastic switching-Ornstein–Uhlenbeck (SSOU) model. We introduce a measure of non-Markovianity to quantify the amount of memory in the sequences. For a minimal auto-regressive (AR) learning model trained on this task, we identify two learning regimes corresponding to distinct phases in the stationary state of the SSOU process. These phases emerge from the interplay between two different time scales that govern the sequence statistics. Moreover, we observe that while increasing the integration window of the AR model always improves performance, albeit with diminishing returns, increasing the non-Markovianity of the input sequences can improve or degrade its performance. Finally, we perform experiments with recurrent and convolutional neural networks that show that our observations carry over to more complicated neural network architectures.
Author Loos, Sarah A M
Goldt, Sebastian
Seif, Alireza
Roldán, Édgar
Tucci, Gennaro
Author_xml – sequence: 1
  givenname: Alireza
  orcidid: 0000-0001-5419-5999
  surname: Seif
  fullname: Seif, Alireza
  organization: Pritzker School of Molecular Engineering, University of Chicago , Chicago, IL 60637, United States of America
– sequence: 2
  givenname: Sarah A M
  orcidid: 0000-0002-5946-5684
  surname: Loos
  fullname: Loos, Sarah A M
  organization: University of Cambridge DAMTP, Centre for Mathematical Sciences, Cambridge CB3 0WA, United Kingdom
– sequence: 3
  givenname: Gennaro
  surname: Tucci
  fullname: Tucci, Gennaro
  organization: Max Planck Institute for Dynamics and Self-Organization , Göttingen, Germany
– sequence: 4
  givenname: Édgar
  orcidid: 0000-0001-7196-8404
  surname: Roldán
  fullname: Roldán, Édgar
  organization: ICTP—The Abdus Salam International Centre for Theoretical Physics , Trieste, Italy
– sequence: 5
  givenname: Sebastian
  orcidid: 0000-0002-5799-7644
  surname: Goldt
  fullname: Goldt, Sebastian
  organization: International School of Advanced Studies (SISSA) , Trieste, Italy
CODEN MLSTCK
CitedBy_id crossref_primary_10_1103_PhysRevResearch_6_023057
ContentType Journal Article
Copyright 2024 The Author(s). Published by IOP Publishing Ltd
2024 The Author(s). Published by IOP Publishing Ltd. This work is published under http://creativecommons.org/licenses/by/4.0 (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI 10.1088/2632-2153/ad2feb
Discipline Computer Science
EISSN 2632-2153
GrantInformation_xml – fundername: Chicago Prize Postdoctoral Fellowship
ISSN 2632-2153
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 1
Language English
License Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
Notes MLST-101241.R1
ORCID 0000-0002-5946-5684
0000-0002-5799-7644
0000-0001-7196-8404
0000-0001-5419-5999
OpenAccessLink https://iopscience.iop.org/article/10.1088/2632-2153/ad2feb
PageCount 16
PublicationCentury 2000
PublicationDate 2024-03-01
PublicationDateYYYYMMDD 2024-03-01
PublicationDate_xml – month: 03
  year: 2024
  text: 2024-03-01
  day: 01
PublicationDecade 2020
PublicationPlace Bristol
PublicationPlace_xml – name: Bristol
PublicationTitle Machine learning: science and technology
PublicationTitleAbbrev MLST
PublicationTitleAlternate Mach. Learn.: Sci. Technol
PublicationYear 2024
Publisher IOP Publishing
Publisher_xml – name: IOP Publishing
StartPage 15053
SubjectTerms Artificial neural networks
auto-regressive model
Autoregressive models
Learning
memory
Memory tasks
Natural language processing
Neural networks
non-Markovianity
Performance degradation
Performance enhancement
recurrent neural network
sequence-to-sequence task
statistical physics
Title The impact of memory on learning sequence-to-sequence tasks
URI https://iopscience.iop.org/article/10.1088/2632-2153/ad2feb
https://www.proquest.com/docview/2973186940
https://doaj.org/article/340d7140992545d6816cbb17106f49cb
Volume 5