A generalized LSTM-like training algorithm for second-order recurrent neural networks
Published in | Neural networks Vol. 25; no. 1; pp. 70 - 83 |
Main Authors | Monner, Derek; Reggia, James A. |
Format | Journal Article |
Language | English |
Published | Kidlington: Elsevier Ltd, 01.01.2012 |
Subjects | Gradient-based training; Long Short Term Memory (LSTM); Recurrent neural network; Sequential retrieval; Temporal sequence processing |
Abstract | The Long Short Term Memory (LSTM) is a second-order recurrent neural network architecture that excels at storing sequential short-term memories and retrieving them many time-steps later. LSTM’s original training algorithm provides the important properties of spatial and temporal locality, which are missing from other training approaches, at the cost of limiting its applicability to a small set of network architectures. Here we introduce the Generalized Long Short-Term Memory (LSTM-g) training algorithm, which provides LSTM-like locality while being applicable without modification to a much wider range of second-order network architectures. With LSTM-g, all units have an identical set of operating instructions for both activation and learning, subject only to the configuration of their local environment in the network; this is in contrast to the original LSTM training algorithm, where each type of unit has its own activation and training instructions. When applied to LSTM architectures with peephole connections, LSTM-g takes advantage of an additional source of back-propagated error which can enable better performance than the original algorithm. Enabled by the broad architectural applicability of LSTM-g, we demonstrate that training recurrent networks engineered for specific tasks can produce better results than single-layer networks. We conclude that LSTM-g has the potential to both improve the performance and broaden the applicability of spatially and temporally local gradient-based training algorithms for recurrent neural networks. |
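The abstract refers to second-order recurrent networks (units that multiplicatively gate other signals) and to LSTM cells with peephole connections. As a rough illustration only, and not the article's LSTM-g procedure, the sketch below shows the forward pass of a single conventional peephole LSTM cell in standard textbook form; the function name, parameter layout, and dimensions are assumptions made for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def peephole_lstm_step(x, h_prev, c_prev, W, U, p, b):
    """One forward step of a peephole LSTM layer (illustrative sketch, not LSTM-g).

    x: input vector; h_prev, c_prev: previous hidden and cell-state vectors.
    W, U: dicts of input/recurrent weight matrices for gates 'i', 'f', 'o' and cell input 'c'.
    p: dict of peephole weight vectors (element-wise connections from the cell state to each gate).
    b: dict of bias vectors.
    """
    # Input and forget gates "peek" at the previous cell state via peephole weights.
    i = sigmoid(W['i'] @ x + U['i'] @ h_prev + p['i'] * c_prev + b['i'])
    f = sigmoid(W['f'] @ x + U['f'] @ h_prev + p['f'] * c_prev + b['f'])
    # Candidate cell input, then multiplicative (second-order) gating of the cell state.
    g = np.tanh(W['c'] @ x + U['c'] @ h_prev + b['c'])
    c = f * c_prev + i * g
    # Output gate peeks at the updated cell state, then gates the cell output.
    o = sigmoid(W['o'] @ x + U['o'] @ h_prev + p['o'] * c + b['o'])
    h = o * np.tanh(c)
    return h, c

# Tiny usage example with random parameters (4-dim input, 3 memory cells).
rng = np.random.default_rng(0)
n_in, n_cell = 4, 3
W = {k: rng.standard_normal((n_cell, n_in)) * 0.1 for k in 'ifoc'}
U = {k: rng.standard_normal((n_cell, n_cell)) * 0.1 for k in 'ifoc'}
p = {k: rng.standard_normal(n_cell) * 0.1 for k in 'ifo'}
b = {k: np.zeros(n_cell) for k in 'ifoc'}
h, c = np.zeros(n_cell), np.zeros(n_cell)
for x in rng.standard_normal((5, n_in)):  # feed a short input sequence
    h, c = peephole_lstm_step(x, h, c, W, U, p, b)
```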
Author | Monner, Derek (dmonner@cs.umd.edu); Reggia, James A. (reggia@cs.umd.edu) |
AuthorAffiliation | Department of Computer Science, University of Maryland, College Park, MD 20742, USA |
BackLink | http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=25331405 (View record in Pascal Francis); https://www.ncbi.nlm.nih.gov/pubmed/21803542 (View this record in MEDLINE/PubMed) |
ContentType | Journal Article |
Copyright | © 2011 Elsevier Ltd. All rights reserved; 2015 INIST-CNRS |
DOI | 10.1016/j.neunet.2011.07.003 |
DatabaseName | CrossRef; Pascal-Francis; Medline; MEDLINE; MEDLINE (Ovid); PubMed; MEDLINE - Academic; Biotechnology Research Abstracts; Neurosciences Abstracts; Technology Research Database; Engineering Research Database; Biotechnology and BioEngineering Abstracts; PubMed Central (Full Participant titles) |
DatabaseTitle | CrossRef; MEDLINE; Medline Complete; MEDLINE with Full Text; PubMed; MEDLINE (Ovid); MEDLINE - Academic; Engineering Research Database; Biotechnology Research Abstracts; Technology Research Database; Neurosciences Abstracts; Biotechnology and BioEngineering Abstracts |
DatabaseTitleList | MEDLINE; MEDLINE - Academic; Engineering Research Database |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Computer Science Applied Sciences |
EISSN | 1879-2782 |
EndPage | 83 |
ExternalDocumentID | PMC3217173 21803542 25331405 10_1016_j_neunet_2011_07_003 S0893608011002036 |
Genre | Research Support, U.S. Gov't, Non-P.H.S.; Comparative Study; Journal Article; Research Support, N.I.H., Extramural |
GrantInformation_xml | – fundername: NICHD NIH HHS grantid: HD064653 – fundername: NICHD NIH HHS grantid: P01 HD064653 |
ISSN | 0893-6080 1879-2782 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 1 |
Keywords | Gradient-based training; Long Short Term Memory (LSTM); Recurrent neural network; Sequential retrieval; Temporal sequence processing; Short term; Recurrent neural nets; Gradient; Locality; Network architecture; Neural network; Long term; Temporal logic; Local network; Memory effect |
Language | English |
License | https://www.elsevier.com/tdm/userlicense/1.0 CC BY 4.0 Copyright © 2011 Elsevier Ltd. All rights reserved. |
LinkModel | DirectLink |
OpenAccessLink | https://www.ncbi.nlm.nih.gov/pmc/articles/3217173 |
PMID | 21803542 |
PQID | 908011659 |
PQPubID | 23479 |
PageCount | 14 |
PublicationCentury | 2000 |
PublicationDate | 2012-01-01 |
PublicationPlace | Kidlington |
PublicationTitle | Neural networks |
PublicationTitleAlternate | Neural Netw |
PublicationYear | 2012 |
Publisher | Elsevier Ltd Elsevier |
SourceID | pubmedcentral proquest pubmed pascalfrancis crossref elsevier |
SourceType | Open Access Repository Aggregation Database Index Database Enrichment Source Publisher |
StartPage | 70 |
SubjectTerms | Algorithms Applied sciences Artificial intelligence Computer science; control theory; systems Computer systems and distributed systems. User interface Connectionism. Neural networks Exact sciences and technology Gradient-based training Learning - physiology Long Short Term Memory (LSTM) Neural Networks, Computer Recurrent neural network Sequential retrieval Software Temporal sequence processing Time Factors |
Title | A generalized LSTM-like training algorithm for second-order recurrent neural networks |
URI | https://dx.doi.org/10.1016/j.neunet.2011.07.003 https://www.ncbi.nlm.nih.gov/pubmed/21803542 https://www.proquest.com/docview/908011659 https://www.proquest.com/docview/912921578 https://pubmed.ncbi.nlm.nih.gov/PMC3217173 |
Volume | 25 |