Full Page Handwriting Recognition via Image to Sequence Extraction
Published in | arXiv.org |
---|---|
Main Authors | Singh, Sumeet S; Karayev, Sergey |
Format | Paper; Journal Article |
Language | English |
Published | Ithaca: Cornell University Library, arXiv.org, 26.06.2022 |
ISSN | 2331-8422 |
DOI | 10.48550/arxiv.2103.06450 |
Abstract | We present a neural-network-based Handwritten Text Recognition (HTR) model architecture that can be trained to recognize full pages of handwritten or printed text without image segmentation. Because it is based on an image-to-sequence architecture, it can extract the text present in an image and sequence it correctly without imposing any constraints on the orientation, layout, or size of text and non-text. It can also be trained to generate auxiliary markup related to formatting, layout, and content. We use a character-level vocabulary, which accommodates the language and terminology of any subject. The model achieves a new state of the art in paragraph-level recognition on the IAM dataset. When evaluated on scans of real-world handwritten free-form test answers (beset with curved and slanted lines, drawings, tables, math, chemistry, and other symbols), it performs better than all commercially available HTR cloud APIs. It is deployed in production as part of a commercial web application. |
---|---|
Author | Singh, Sumeet S; Karayev, Sergey |
BackLink | View paper in arXiv: https://doi.org/10.48550/arXiv.2103.06450; View published paper (access to full text may be restricted): https://doi.org/10.1007/978-3-030-86334-0_4 |
Copyright | 2022. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. http://creativecommons.org/licenses/by/4.0 |
Discipline | Physics |
EISSN | 2331-8422 |
Genre | Working Paper/Pre-Print |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
OpenAccessLink | https://arxiv.org/abs/2103.06450 |
SubjectTerms | Applications programs; Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Computer Science - Computer Vision and Pattern Recognition; Computer Science - Learning; Datasets; Free form; Handwriting recognition; Image segmentation; Layouts; Neural networks; Object recognition; Printed text |
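
The abstract notes that the model uses a character-level vocabulary so that it is not tied to the terminology of any one subject. A minimal sketch of what such a vocabulary could look like follows. The function names, the special tokens, and the sample strings are all illustrative assumptions; this record does not describe the authors' actual implementation.

```python
# Illustrative sketch only: names and special tokens are hypothetical,
# not taken from the paper or its code.
PAD, SOS, EOS, UNK = "<pad>", "<sos>", "<eos>", "<unk>"


def build_vocab(texts):
    """Map every distinct character in the training texts to an integer id."""
    chars = sorted({ch for text in texts for ch in text})
    tokens = [PAD, SOS, EOS, UNK] + chars
    return {tok: i for i, tok in enumerate(tokens)}


def encode(text, vocab):
    """Turn a string into ids, wrapped in start/end markers."""
    ids = [vocab[SOS]]
    # Characters never seen during vocabulary building fall back to UNK.
    ids += [vocab.get(ch, vocab[UNK]) for ch in text]
    ids.append(vocab[EOS])
    return ids


def decode(ids, vocab):
    """Turn ids back into a string, dropping pad/start/end markers."""
    inverse = {i: tok for tok, i in vocab.items()}
    specials = {PAD, SOS, EOS}
    return "".join(inverse[i] for i in ids if inverse[i] not in specials)


# Because the unit is a single character, chemistry or math notation
# needs no special handling beyond appearing in the training text.
vocab = build_vocab(["H2O + CO2", "x^2 = 4"])
ids = encode("H2O", vocab)
```

With a vocabulary like this, any string made of known characters round-trips through `encode` and `decode`, which is the property that lets a character-level model emit words it has never seen as whole units.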