Scene labeling with LSTM recurrent neural networks

Bibliographic Details
Published in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3547-3555
Main Authors Byeon, Wonmin; Breuel, Thomas M.; Raue, Federico; Liwicki, Marcus
Format Conference Proceeding Journal Article
Language English
Published IEEE 01.06.2015
ISSN 1063-6919
2575-7075
DOI 10.1109/CVPR.2015.7298977

Abstract This paper addresses the problem of pixel-level segmentation and classification of scene images with an entirely learning-based approach using Long Short Term Memory (LSTM) recurrent neural networks, which are commonly used for sequence classification. We investigate two-dimensional (2D) LSTM networks for natural scene images, taking into account the complex spatial dependencies of labels. Prior methods have generally required separate classification and image segmentation stages and/or pre- and post-processing. In our approach, classification, segmentation, and context integration are all carried out by 2D LSTM networks, allowing texture and spatial model parameters to be learned within a single model. The networks efficiently capture local and global contextual information over raw RGB values and adapt well to complex scene images. Our approach, which has a much lower computational complexity than prior methods, achieved state-of-the-art performance on the Stanford Background and the SIFT Flow datasets. In fact, if no pre- or post-processing is applied, LSTM networks outperform other state-of-the-art approaches. Hence, with only a single-core Central Processing Unit (CPU), the running time of our approach is equivalent to or better than that of the compared state-of-the-art approaches, which use a Graphics Processing Unit (GPU). Finally, our networks' ability to visualize feature maps from each layer supports the hypothesis that LSTM networks are well suited overall for image processing tasks.
Authors Byeon, Wonmin (wonmin.byeon@dfki.de), Univ. of Kaiserslautern, Kaiserslautern, Germany
Breuel, Thomas M. (tmb@cs.uni-kl.de), Univ. of Kaiserslautern, Kaiserslautern, Germany
Raue, Federico (federico.raue@dfki.de), Univ. of Kaiserslautern, Kaiserslautern, Germany
Liwicki, Marcus (liwicki@cs.uni-kl.de), Univ. of Kaiserslautern, Kaiserslautern, Germany
Discipline Applied Sciences
Computer Science
EISBN 1467369640
9781467369640
EISSN 1063-6919
2575-7075
Genre orig-research
IsPeerReviewed false
IsScholarly true
PageCount 9
Subjects Accuracy; Classification; Feedforward neural networks; Networks; Recurrent neural networks; Roads; Segmentation; State of the art; Texture; Two dimensional; Weaving
URI https://ieeexplore.ieee.org/document/7298977
https://www.proquest.com/docview/1768576077
https://www.proquest.com/docview/1770346665