Accurate Fine-Grained Layout Analysis for the Historical Tibetan Document Based on the Instance Segmentation
Accurate layout analysis without subsequent text-line segmentation remains an ongoing challenge, especially when facing the Kangyur, a kind of historical Tibetan document featuring considerable touching components and mottled background. Aiming at identifying different regions in document images, la...
Published in | IEEE Access, Vol. 9, pp. 154435-154447 |
---|---|
Main Authors | Zhao, Penghai; Wang, Weilan; Cai, Zhengqi; Zhang, Guowei; Lu, Yuqi |
Format | Journal Article |
Language | English |
Published | Piscataway, IEEE, 2021. The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Abstract | Accurate layout analysis without subsequent text-line segmentation remains an ongoing challenge, especially for the Kangyur, a kind of historical Tibetan document featuring many touching components and a mottled background. Layout analysis, which aims to identify the different regions in document images, is indispensable for subsequent procedures such as character recognition. However, only a little research has been carried out on line-level layout analysis, and it fails to handle the Kangyur. To obtain optimal results, a fine-grained sub-line-level layout analysis approach is presented. First, we introduce an accelerated method to build the dataset that is dynamic and reliable. Second, we enhance SOLOv2 according to the characteristics of the Kangyur. Then, we feed the enhanced SOLOv2 with the prepared annotation file during the training phase. Once the network is trained, instances of text lines, sentences, and titles can be segmented and identified during the inference stage. The experimental results show that the proposed method achieves an average precision of 72.7% on our dataset. In general, this preliminary research provides insights into fine-grained sub-line-level layout analysis and validates SOLOv2-based approaches. We also believe that the proposed methods can be applied to documents in other languages with various layouts. |
---|---|
Author | Cai, Zhengqi; Zhang, Guowei; Wang, Weilan; Zhao, Penghai; Lu, Yuqi |
Author_xml | – sequence: 1 givenname: Penghai orcidid: 0000-0002-9613-5318 surname: Zhao fullname: Zhao, Penghai organization: Key Laboratory of China's Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, Lanzhou, China – sequence: 2 givenname: Weilan orcidid: 0000-0003-2935-6601 surname: Wang fullname: Wang, Weilan email: wangweilan@xbmu.edu.cn organization: Key Laboratory of China's Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, Lanzhou, China – sequence: 3 givenname: Zhengqi surname: Cai fullname: Cai, Zhengqi organization: School of Mathematics and Computer Science, Northwest Minzu University, Lanzhou, China – sequence: 4 givenname: Guowei orcidid: 0000-0002-6656-3960 surname: Zhang fullname: Zhang, Guowei organization: Key Laboratory of China's Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, Lanzhou, China – sequence: 5 givenname: Yuqi surname: Lu fullname: Lu, Yuqi organization: Key Laboratory of China's Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, Lanzhou, China |
CODEN | IAECCG |
CitedBy_id | crossref_primary_10_1186_s40494_023_01125_w crossref_primary_10_1109_ACCESS_2022_3151886 crossref_primary_10_3390_electronics11233919 crossref_primary_10_1007_s00371_023_02850_w crossref_primary_10_3233_JCM_226167 crossref_primary_10_3390_jimaging10030065 crossref_primary_10_11834_jig_240015 |
Cites_doi | 10.1007/978-3-030-03338-5_7 10.1109/ICDAR.2017.50 10.1109/ICDAR.2019.00088 10.1109/ICDAR.2013.154 10.1007/978-3-030-01264-9_5 10.1109/ICDAR.2019.00166 10.1109/ICCV.2019.00925 10.1109/TPAMI.2017.2699184 10.1109/LSP.2014.2325940 10.1109/CVPR.2017.462 10.1016/j.ipm.2021.102689 10.1109/CVPR.2016.90 10.1109/ICDAR.2015.7333877 10.1109/ICMLA.2019.00223 10.1109/ICDAR.2019.00066 10.1007/s11263-007-0090-8 10.1109/ICCV.2017.324 10.1109/ACCESS.2020.2975023 10.1109/3DV.2016.79 10.1109/TPAMI.2016.2646371 10.1109/CVPR.2017.106 10.1109/CVPR.2019.00657 10.1109/ICCV.2017.322 10.1109/CVPR.2017.634 10.1109/ICDAR.2017.161 10.1109/34.31447 10.1109/CVPR.2015.7298965 10.1007/978-3-030-58523-5_38 10.1109/DAS.2016.29 10.1109/ICDAR.2017.94 10.1109/ICDAR.2019.00164 10.1016/0734-189X(85)90016-7 10.1109/ICFHR.2016.0093 |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2021 |
Copyright_xml | – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2021 |
DBID | 97E ESBDL RIA RIE AAYXX CITATION 7SC 7SP 7SR 8BQ 8FD JG9 JQ2 L7M L~C L~D DOA |
DOI | 10.1109/ACCESS.2021.3128536 |
DatabaseName | IEEE All-Society Periodicals Package (ASPP) 2005-present IEEE Open Access Journals IEEE All-Society Periodicals Package (ASPP) 1998-Present IEEE Electronic Library Online CrossRef Computer and Information Systems Abstracts Electronics & Communications Abstracts Engineered Materials Abstracts METADEX Technology Research Database Materials Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional Directory of Open Access Journals |
DatabaseTitle | CrossRef Materials Research Database Engineered Materials Abstracts Technology Research Database Computer and Information Systems Abstracts – Academic Electronics & Communications Abstracts ProQuest Computer Science Collection Computer and Information Systems Abstracts Advanced Technologies Database with Aerospace METADEX Computer and Information Systems Abstracts Professional |
DatabaseTitleList | Materials Research Database |
Database_xml | – sequence: 1 dbid: DOA name: DOAJ Directory of Open Access Journals url: https://www.doaj.org/ sourceTypes: Open Website – sequence: 2 dbid: RIE name: IEEE Electronic Library Online url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Engineering |
EISSN | 2169-3536 |
EndPage | 154447 |
ExternalDocumentID | oai_doaj_org_article_b0378dfd3723490a99c54a1ac5e0d762 10_1109_ACCESS_2021_3128536 9615155 |
Genre | orig-research |
GrantInformation_xml | – fundername: Program for Leading Talent of State Ethnic Affairs Commission – fundername: Program for Innovative Research Team of State Ethnic Affairs Commission (SEAC) ([2018]98) – fundername: National Natural Science Foundation of China grantid: 61772430; 62166036 funderid: 10.13039/501100001809 – fundername: Postgraduate Support Programs of Northwest Minzu University’s Fundamental Research Funds for the Central Universities grantid: Ymx2021002 funderid: 10.13039/501100012226 – fundername: Natural Science Foundation of Gansu Province of China grantid: 21JR1RA195 funderid: 10.13039/501100004775 – fundername: Gansu Provincial First-Class Discipline Program of Northwest Minzu University grantid: 11080305 funderid: 10.13039/501100017619 |
GroupedDBID | 0R~ 4.4 5VS 6IK 97E AAJGR ABVLG ACGFS ADBBV ALMA_UNASSIGNED_HOLDINGS BCNDV BEFXN BFFAM BGNUA BKEBE BPEOZ EBS EJD ESBDL GROUPED_DOAJ IFIPE IPLJI JAVBF KQ8 M43 M~E O9- OCL OK1 RIA RIE RIG RNS AAYXX CITATION 7SC 7SP 7SR 8BQ 8FD JG9 JQ2 L7M L~C L~D |
ID | FETCH-LOGICAL-c408t-cdeddc66a36e25f3180094fa496486b5899a733070ff68bf31581fa26875ca0c3 |
IEDL.DBID | DOA |
ISSN | 2169-3536 |
IngestDate | Tue Oct 22 15:16:48 EDT 2024 Thu Oct 10 17:18:58 EDT 2024 Fri Aug 23 00:57:38 EDT 2024 Wed Jun 26 19:25:25 EDT 2024 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-c408t-cdeddc66a36e25f3180094fa496486b5899a733070ff68bf31581fa26875ca0c3 |
ORCID | 0000-0002-6656-3960 0000-0002-9613-5318 0000-0003-2935-6601 |
OpenAccessLink | https://doaj.org/article/b0378dfd3723490a99c54a1ac5e0d762 |
PQID | 2601644626 |
PQPubID | 4845423 |
PageCount | 13 |
ParticipantIDs | proquest_journals_2601644626 crossref_primary_10_1109_ACCESS_2021_3128536 ieee_primary_9615155 doaj_primary_oai_doaj_org_article_b0378dfd3723490a99c54a1ac5e0d762 |
PublicationCentury | 2000 |
PublicationDate | 20210000 2021-00-00 20210101 2021-01-01 |
PublicationDateYYYYMMDD | 2021-01-01 |
PublicationDate_xml | – year: 2021 text: 20210000 |
PublicationDecade | 2020 |
PublicationPlace | Piscataway |
PublicationPlace_xml | – name: Piscataway |
PublicationTitle | IEEE access |
PublicationTitleAbbrev | Access |
PublicationYear | 2021 |
Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Publisher_xml | – name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
SSID | ssj0000816957 |
Score | 2.2908657 |
SourceID | doaj proquest crossref ieee |
SourceType | Open Website Aggregation Database Publisher |
StartPage | 154435 |
SubjectTerms | Annotations; Character recognition; Datasets; Document analysis and recognition; fine-grained layout analysis; historical Tibetan document images; Image annotation; Image segmentation; Instance segmentation; Layout; layout analysis; Layouts; Object recognition; Semantics; Text analysis; text line segmentation; Text recognition |
Title | Accurate Fine-Grained Layout Analysis for the Historical Tibetan Document Based on the Instance Segmentation |
URI | https://ieeexplore.ieee.org/document/9615155 https://www.proquest.com/docview/2601644626 https://doaj.org/article/b0378dfd3723490a99c54a1ac5e0d762 |
Volume | 9 |
hasFullText | 1 |
inHoldings | 1 |
linkProvider | Directory of Open Access Journals |
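
The abstract reports an average precision of 72.7% for segmented instances (text lines, sentences, titles). As a rough illustration of what such a score builds on, the sketch below computes mask IoU and a single-threshold precision with greedy matching. It is not the authors' evaluation code: the function names `mask_iou` and `precision_at_iou`, the 0.5 threshold, and the toy masks are assumptions made for this example, and a COCO-style average precision would additionally average over IoU thresholds and recall levels.

```python
def mask_iou(a, b):
    """Intersection over union of two binary masks (equal-sized 2D lists of 0/1)."""
    inter = sum(x & y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    union = sum(x | y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return inter / union if union else 0.0


def precision_at_iou(preds, gts, thr=0.5):
    """Fraction of predictions that match a ground-truth mask at IoU >= thr.

    preds: list of (score, mask) pairs; gts: list of masks.
    Predictions are visited in descending score order, and each
    ground-truth mask can be matched at most once (greedy matching).
    """
    matched = set()
    true_pos = 0
    for _score, mask in sorted(preds, key=lambda p: p[0], reverse=True):
        best_iou, best_j = 0.0, None
        for j, gt in enumerate(gts):
            if j in matched:
                continue
            iou = mask_iou(mask, gt)
            if iou > best_iou:
                best_iou, best_j = iou, j
        if best_j is not None and best_iou >= thr:
            matched.add(best_j)
            true_pos += 1
    return true_pos / len(preds) if preds else 0.0


# Hypothetical toy data: one ground-truth instance, two predictions.
gt_mask = [[1, 1, 0], [0, 0, 0]]
good_pred = [[1, 1, 1], [0, 0, 0]]   # overlaps the ground truth
stray_pred = [[0, 0, 0], [0, 0, 1]]  # overlaps nothing

print(mask_iou(good_pred, gt_mask))
print(precision_at_iou([(0.9, good_pred), (0.4, stray_pred)], [gt_mask]))
```

Plain nested lists keep the sketch dependency-free; a real pipeline would more likely rely on an established evaluation toolkit operating on the COCO-format annotation file mentioned in the abstract.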