Scene labeling with LSTM recurrent neural networks
Published in | 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) pp. 3547 - 3555 |
---|---|
Main Authors | Wonmin Byeon, Thomas M. Breuel, Federico Raue, Marcus Liwicki |
Format | Conference Proceeding, Journal Article |
Language | English |
Published | IEEE, 01.06.2015 |
Subjects | |
ISSN | 1063-6919, 2575-7075 |
DOI | 10.1109/CVPR.2015.7298977 |
Abstract | This paper addresses the problem of pixel-level segmentation and classification of scene images with an entirely learning-based approach using Long Short Term Memory (LSTM) recurrent neural networks, which are commonly used for sequence classification. We investigate two-dimensional (2D) LSTM networks for natural scene images taking into account the complex spatial dependencies of labels. Prior methods generally have required separate classification and image segmentation stages and/or pre- and post-processing. In our approach, classification, segmentation, and context integration are all carried out by 2D LSTM networks, allowing texture and spatial model parameters to be learned within a single model. The networks efficiently capture local and global contextual information over raw RGB values and adapt well for complex scene images. Our approach, which has a much lower computational complexity than prior methods, achieved state-of-the-art performance over the Stanford Background and the SIFT Flow datasets. In fact, if no pre- or post-processing is applied, LSTM networks outperform other state-of-the-art approaches. Hence, only with a single-core Central Processing Unit (CPU), the running time of our approach is equivalent or better than the compared state-of-the-art approaches which use a Graphics Processing Unit (GPU). Finally, our networks' ability to visualize feature maps from each layer supports the hypothesis that LSTM networks are overall suited for image processing tasks. |
---|---|
Author | Wonmin Byeon; Liwicki, Marcus; Breuel, Thomas M.; Raue, Federico |
Author_xml | – sequence: 1 surname: Wonmin Byeon fullname: Wonmin Byeon email: wonmin.byeon@dfki.de organization: Univ. of Kaiserslautern, Kaiserslautern, Germany – sequence: 2 givenname: Thomas M. surname: Breuel fullname: Breuel, Thomas M. email: tmb@cs.uni-kl.de organization: Univ. of Kaiserslautern, Kaiserslautern, Germany – sequence: 3 givenname: Federico surname: Raue fullname: Raue, Federico email: federico.raue@dfki.de organization: Univ. of Kaiserslautern, Kaiserslautern, Germany – sequence: 4 givenname: Marcus surname: Liwicki fullname: Liwicki, Marcus email: liwicki@cs.uni-kl.de organization: Univ. of Kaiserslautern, Kaiserslautern, Germany |
ContentType | Conference Proceeding Journal Article |
DBID | 6IE 6IH CBEJK RIE RIO 7QO 8FD FR3 P64 7SC JQ2 L7M L~C L~D |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Proceedings Order Plan (POP) 1998-present by volume IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP) 1998-present Biotechnology Research Abstracts Technology Research Database Engineering Research Database Biotechnology and BioEngineering Abstracts Computer and Information Systems Abstracts ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional |
DatabaseTitle | Engineering Research Database Biotechnology Research Abstracts Technology Research Database Biotechnology and BioEngineering Abstracts Computer and Information Systems Abstracts Computer and Information Systems Abstracts – Academic Advanced Technologies Database with Aerospace ProQuest Computer Science Collection Computer and Information Systems Abstracts Professional |
DatabaseTitleList | Computer and Information Systems Abstracts Engineering Research Database |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Applied Sciences Computer Science |
EISBN | 1467369640 9781467369640 |
EISSN | 1063-6919 2575-7075 |
EndPage | 3555 |
ExternalDocumentID | 7298977 |
Genre | orig-research |
GroupedDBID | 23M 29F 29O 6IE 6IH 6IK ABDPE ACGFS ALMA_UNASSIGNED_HOLDINGS CBEJK IPLJI M43 RIE RIO RNS 7QO 8FD FR3 P64 7SC JQ2 L7M L~C L~D |
ID | FETCH-LOGICAL-i241t-39ab9c58b4617e838a452767756ae9a1ffc5905eb176c535570ecd68cade65943 |
IEDL.DBID | RIE |
ISSN | 1063-6919 |
IngestDate | Thu Jul 10 17:51:15 EDT 2025 Tue Aug 05 09:59:06 EDT 2025 Wed Aug 27 02:49:18 EDT 2025 |
IsPeerReviewed | false |
IsScholarly | true |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-i241t-39ab9c58b4617e838a452767756ae9a1ffc5905eb176c535570ecd68cade65943 |
Notes | ObjectType-Article-2 SourceType-Scholarly Journals-1 ObjectType-Conference-1 ObjectType-Feature-3 content type line 23 SourceType-Conference Papers & Proceedings-2 |
PQID | 1768576077 |
PQPubID | 23462 |
PageCount | 9 |
ParticipantIDs | ieee_primary_7298977 proquest_miscellaneous_1768576077 proquest_miscellaneous_1770346665 |
PublicationCentury | 2000 |
PublicationDate | 20150601 |
PublicationDateYYYYMMDD | 2015-06-01 |
PublicationDate_xml | – month: 06 year: 2015 text: 20150601 day: 01 |
PublicationDecade | 2010 |
PublicationTitle | 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) |
PublicationTitleAbbrev | CVPR |
PublicationYear | 2015 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
SSID | ssib030089920 ssj0023720 ssj0003211698 |
Score | 2.4200058 |
SourceID | proquest ieee |
SourceType | Aggregation Database Publisher |
StartPage | 3547 |
SubjectTerms | Accuracy; Classification; Feedforward neural networks; Networks; Recurrent neural networks; Roads; Segmentation; State of the art; Texture; Two dimensional; Weaving |
Title | Scene labeling with LSTM recurrent neural networks |
URI | https://ieeexplore.ieee.org/document/7298977 https://www.proquest.com/docview/1768576077 https://www.proquest.com/docview/1770346665 |
hasFullText | 1 |
inHoldings | 1 |