Recognizing Text with Perspective Distortion in Natural Scenes


Bibliographic Details
Published in 2013 IEEE International Conference on Computer Vision, pp. 569-576
Main Authors Trung Quy Phan; Shivakumara, Palaiahnakote; Shangxuan Tian; Chew Lim Tan
Format Conference Proceeding; Journal Article
Language English
Published IEEE 01.12.2013
ISSN 1550-5499
DOI 10.1109/ICCV.2013.76

Abstract This paper presents an approach to text recognition in natural scene images. Unlike most existing work, which assumes that text is horizontal and fronto-parallel to the image plane, our method is able to recognize perspective text of arbitrary orientation. For individual character recognition, we adopt a bag-of-keypoints approach, in which Scale Invariant Feature Transform (SIFT) descriptors are extracted densely and quantized using a pre-trained vocabulary. Following [1, 2], context information is exploited through lexicons. We formulate word recognition as finding the optimal alignment between the set of characters and the list of lexicon words. Furthermore, we introduce a new dataset called StreetViewText-Perspective, which contains text in street images with a great variety of viewpoints. Experimental results on public datasets and the proposed dataset show that our method significantly outperforms the state of the art on perspective text of arbitrary orientation.
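The two steps named in the abstract (quantizing local descriptors against a vocabulary, then aligning character hypotheses with lexicon words) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The descriptors here are plain lists of floats standing in for dense SIFT descriptors, the vocabulary is a hypothetical list of cluster centres, and the alignment cost (one minus a character score, plus a fixed `skip_cost` for unmatched items) is an assumption chosen for illustration; the record does not state the paper's actual objective function. All function and parameter names are hypothetical.

```python
def bag_of_keypoints(descriptors, vocabulary):
    """Quantize descriptors to their nearest vocabulary entry and
    return a normalized histogram of visual-word counts."""
    hist = [0.0] * len(vocabulary)
    for d in descriptors:
        # Index of the vocabulary entry with the smallest squared distance.
        nearest = min(
            range(len(vocabulary)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(d, vocabulary[k])),
        )
        hist[nearest] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def alignment_cost(char_scores, word, skip_cost=1.0):
    """Dynamic-programming alignment between detected characters and a word.

    char_scores: one dict per detected character, mapping a candidate
    letter to a score in [0, 1] (assumed classifier output).
    skip_cost: assumed penalty for a spurious detection or a missed letter.
    """
    n, m = len(char_scores), len(word)
    # cost[i][j] = best cost aligning the first i detections with the
    # first j letters of the word.
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * skip_cost
    for j in range(1, m + 1):
        cost[0][j] = j * skip_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = cost[i - 1][j - 1] + (1.0 - char_scores[i - 1].get(word[j - 1], 0.0))
            spurious = cost[i - 1][j] + skip_cost  # detection matches no letter
            missed = cost[i][j - 1] + skip_cost    # letter has no detection
            cost[i][j] = min(match, spurious, missed)
    return cost[n][m]


def recognize_word(char_scores, lexicon, skip_cost=1.0):
    """Return the lexicon word with the lowest alignment cost."""
    return min(lexicon, key=lambda w: alignment_cost(char_scores, w.lower(), skip_cost))


# Hypothetical character hypotheses for a five-character word region.
example_scores = [
    {"p": 0.9, "b": 0.1},
    {"i": 0.6, "l": 0.4},
    {"z": 0.5, "2": 0.5},
    {"z": 0.7},
    {"a": 0.8, "o": 0.2},
]
best = recognize_word(example_scores, ["pizza", "plaza", "pasta"])
```

In this sketch the lexicon does the disambiguation: the third detection is equally likely to be "z" or "2", but only "z" appears in a lexicon word at that position, so the alignment settles on it.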
Author_xml – sequence: 1
  surname: Trung Quy Phan
  fullname: Trung Quy Phan
  email: phanquyt@comp.nus.edu.sg
  organization: Sch. of Comput., Nat. Univ. of Singapore, Singapore, Singapore
– sequence: 2
  givenname: Palaiahnakote
  surname: Shivakumara
  fullname: Shivakumara, Palaiahnakote
  email: shiva@um.edu.my
– sequence: 3
  surname: Shangxuan Tian
  fullname: Shangxuan Tian
  email: tians@comp.nus.edu.sg
  organization: Sch. of Comput., Nat. Univ. of Singapore, Singapore, Singapore
– sequence: 4
  surname: Chew Lim Tan
  fullname: Chew Lim Tan
  email: tancl@comp.nus.edu.sg
  organization: Sch. of Comput., Nat. Univ. of Singapore, Singapore, Singapore
CODEN IEEPAD
Discipline Applied Sciences
EISBN 1479928402
9781479928408
OpenAccessLink http://scholarbank.nus.edu.sg/handle/10635/78316
SubjectTerms Accuracy
Character recognition
Computer vision
Context
Equations
Feature extraction
Image recognition
Invariants
Object recognition
Orientation
Recognition
Text recognition
Texts
URI https://ieeexplore.ieee.org/document/6751180
https://www.proquest.com/docview/1669868348