The impact of memory on learning sequence-to-sequence tasks
The recent success of neural networks in natural language processing has drawn renewed attention to learning sequence-to-sequence (seq2seq) tasks. While there exists a rich literature that studies classification and regression tasks using solvable models of neural networks, seq2seq tasks have not ye...
Published in | Machine learning: science and technology Vol. 5; no. 1; pp. 15053 - 15068 |
---|---|
Main Authors | Seif, Alireza; Loos, Sarah A M; Tucci, Gennaro; Roldán, Édgar; Goldt, Sebastian |
Format | Journal Article |
Language | English |
Published | Bristol, IOP Publishing, 01.03.2024 |
Abstract | The recent success of neural networks in natural language processing has drawn renewed attention to learning sequence-to-sequence (seq2seq) tasks. While there exists a rich literature that studies classification and regression tasks using solvable models of neural networks, seq2seq tasks have not yet been studied from this perspective. Here, we propose a simple model for a seq2seq task that has the advantage of providing explicit control over the degree of memory, or non-Markovianity, in the sequences—the stochastic switching-Ornstein–Uhlenbeck (SSOU) model. We introduce a measure of non-Markovianity to quantify the amount of memory in the sequences. For a minimal auto-regressive (AR) learning model trained on this task, we identify two learning regimes corresponding to distinct phases in the stationary state of the SSOU process. These phases emerge from the interplay between two different time scales that govern the sequence statistics. Moreover, we observe that while increasing the integration window of the AR model always improves performance, albeit with diminishing returns, increasing the non-Markovianity of the input sequences can improve or degrade its performance. Finally, we perform experiments with recurrent and convolutional neural networks that show that our observations carry over to more complicated neural network architectures. |
---|---|
Author | Loos, Sarah A M; Goldt, Sebastian; Seif, Alireza; Roldán, Édgar; Tucci, Gennaro |
Author_xml | 1. Seif, Alireza (ORCID 0000-0001-5419-5999), Pritzker School of Molecular Engineering, University of Chicago, Chicago, IL 60637, United States of America; 2. Loos, Sarah A M (ORCID 0000-0002-5946-5684), University of Cambridge DAMTP, Centre for Mathematical Sciences, Cambridge CB3 0WA, United Kingdom; 3. Tucci, Gennaro, Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany; 4. Roldán, Édgar (ORCID 0000-0001-7196-8404), ICTP, The Abdus Salam International Centre for Theoretical Physics, Trieste, Italy; 5. Goldt, Sebastian (ORCID 0000-0002-5799-7644), International School of Advanced Studies (SISSA), Trieste, Italy |
CODEN | MLSTCK |
ContentType | Journal Article |
Copyright | 2024 The Author(s). Published by IOP Publishing Ltd. This work is published under http://creativecommons.org/licenses/by/4.0 (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
DOI | 10.1088/2632-2153/ad2feb |
Discipline | Computer Science |
EISSN | 2632-2153 |
GrantInformation_xml | – fundername: Chicago Prize Postdoctoral Fellowship |
ISSN | 2632-2153 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 1 |
Language | English |
License | Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. |
ORCID | 0000-0002-5946-5684 0000-0002-5799-7644 0000-0001-7196-8404 0000-0001-5419-5999 |
OpenAccessLink | https://iopscience.iop.org/article/10.1088/2632-2153/ad2feb |
PageCount | 16 |
PublicationCentury | 2000 |
PublicationDate | 2024-03-01 |
PublicationPlace | Bristol |
PublicationTitle | Machine learning: science and technology |
PublicationTitleAbbrev | MLST |
PublicationTitleAlternate | Mach. Learn.: Sci. Technol |
PublicationYear | 2024 |
Publisher | IOP Publishing |
StartPage | 15053 |
SubjectTerms | Artificial neural networks; auto-regressive model; Autoregressive models; Learning; memory; Memory tasks; Natural language processing; Neural networks; non-Markovianity; Performance degradation; Performance enhancement; recurrent neural network; sequence-to-sequence task; statistical physics |
Title | The impact of memory on learning sequence-to-sequence tasks |
URI | https://iopscience.iop.org/article/10.1088/2632-2153/ad2feb https://www.proquest.com/docview/2973186940 https://doaj.org/article/340d7140992545d6816cbb17106f49cb |
Volume | 5 |
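
The abstract describes sequences generated by a stochastic switching-Ornstein–Uhlenbeck (SSOU) process and a minimal auto-regressive model that reads a finite window of the input. The record gives no equations, so the following is only an illustrative sketch under stated assumptions: a particle relaxes toward a trap centre that flips between `+c0` and `-c0`, waiting times between flips are gamma-distributed with shape `shape_k` (with `shape_k = 1` giving memoryless, exponential switching), and the readout is a plain least-squares fit on a sliding window. The function names, parameter names, default values, and the least-squares readout are all hypothetical choices for illustration and are not taken from the article.

```python
import numpy as np


def simulate_ssou(n_steps, dt=0.01, kappa=1.0, diffusion=1.0,
                  c0=1.0, shape_k=2.0, mean_wait=1.0, seed=0):
    """Euler-Maruyama sketch of dx = -kappa * (x - c) dt + sqrt(2 D) dW.

    Assumption: the centre c flips sign after gamma-distributed waiting
    times with shape `shape_k` and mean `mean_wait`.
    """
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps)
    c = np.empty(n_steps)
    pos, center, t = 0.0, c0, 0.0
    next_switch = rng.gamma(shape_k, mean_wait / shape_k)
    noise_scale = np.sqrt(2.0 * diffusion * dt)
    for i in range(n_steps):
        # At most one flip per time step; adequate when dt << mean_wait.
        if t >= next_switch:
            center = -center
            next_switch += rng.gamma(shape_k, mean_wait / shape_k)
        pos += -kappa * (pos - center) * dt + noise_scale * rng.standard_normal()
        x[i], c[i] = pos, center
        t += dt
    return x, c


def fit_window_readout(x, c, window):
    """Least-squares readout of c[t] from x[t-window+1 .. t] plus a bias.

    This linear fit is a stand-in for a minimal auto-regressive model,
    not the model studied in the article.
    """
    feats = np.lib.stride_tricks.sliding_window_view(x, window)
    targets = c[window - 1:]
    design = np.column_stack([feats, np.ones(len(feats))])
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    mse = float(np.mean((design @ weights - targets) ** 2))
    return weights, mse


if __name__ == "__main__":
    x, c = simulate_ssou(20000, shape_k=4.0)
    for window in (1, 5, 25):
        _, mse = fit_window_readout(x, c, window)
        print(f"window={window:3d}  in-sample MSE={mse:.4f}")
```

Varying `window` and `shape_k` in such a sketch is one way to explore, informally, the two knobs the abstract discusses: the integration window of the model and the amount of memory in the input sequence.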