Accurate Fine-Grained Layout Analysis for the Historical Tibetan Document Based on the Instance Segmentation

Bibliographic Details
Published in IEEE Access, Vol. 9, pp. 154435-154447
Main Authors Zhao, Penghai; Wang, Weilan; Cai, Zhengqi; Zhang, Guowei; Lu, Yuqi
Format Journal Article
Language English
Published Piscataway IEEE 2021
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Abstract Accurate layout analysis that does not rely on subsequent text-line segmentation remains an open challenge, especially for the Kangyur, a historical Tibetan document characterized by many touching components and a mottled background. Layout analysis aims to identify the different regions in a document image and is indispensable for subsequent steps such as character recognition. However, little research has addressed line-level layout analysis, and the existing approaches fail on the Kangyur. To obtain better results, a fine-grained, sub-line-level layout analysis approach is presented. First, we introduce an accelerated method for building the dataset that is dynamic and reliable. Second, we enhance SOLOv2 according to the characteristics of the Kangyur. We then train the enhanced SOLOv2 on the prepared annotation file. Once the network is trained, instances of text lines, sentences, and titles can be segmented and identified at inference time. The experimental results show that the proposed method achieves 72.7% average precision on our dataset. Overall, this preliminary research provides insights into fine-grained, sub-line-level layout analysis and supports the viability of SOLOv2-based approaches. We also believe that the proposed methods can be adapted to documents in other languages with various layouts.
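To make the reported metric concrete, the following is a minimal sketch of how average precision can be computed for instance masks at a single IoU threshold. It is not the authors' evaluation code: the abstract does not state the exact protocol behind the 72.7% figure, so the threshold of 0.5, the greedy matching rule, and the names `rect`, `mask_iou`, and `average_precision` are all assumptions made for illustration. Masks are represented as sets of pixel coordinates to keep the example dependency-free.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]
Mask = Set[Pixel]


def rect(r0: int, c0: int, r1: int, c1: int) -> Mask:
    """Hypothetical helper: a filled rectangle mask over rows [r0, r1) and cols [c0, c1)."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


def mask_iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two pixel-set masks."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def average_precision(preds: List[Tuple[float, Mask]],
                      gts: List[Mask],
                      iou_thr: float = 0.5) -> float:
    """AP for one class at one IoU threshold (simplified, assumed protocol).

    preds: (confidence score, predicted mask) pairs.
    gts:   ground-truth masks, e.g. annotated text lines.
    """
    if not gts:
        return 0.0
    matched = set()  # indices of ground-truth masks already claimed
    hits = []        # True for a true positive, False for a false positive
    for _, mask in sorted(preds, key=lambda p: p[0], reverse=True):
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            if j in matched:
                continue
            iou = mask_iou(mask, gt)
            if iou > best_iou:
                best_iou, best_j = iou, j
        if best_j >= 0 and best_iou >= iou_thr:
            matched.add(best_j)
            hits.append(True)
        else:
            hits.append(False)

    # Precision and recall after each prediction, in descending score order.
    precisions, recalls, tp = [], [], 0
    for k, hit in enumerate(hits, start=1):
        tp += int(hit)
        precisions.append(tp / k)
        recalls.append(tp / len(gts))

    # Make precision non-increasing from left to right (interpolation envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


if __name__ == "__main__":
    # Two toy ground-truth "text lines" and three predictions.
    gts = [rect(0, 0, 2, 10), rect(4, 0, 6, 10)]
    preds = [
        (0.9, rect(0, 0, 2, 10)),    # exact match
        (0.8, rect(10, 0, 12, 10)),  # overlaps nothing: false positive
        (0.7, rect(4, 0, 6, 8)),     # partial match with the second line
    ]
    print(round(average_precision(preds, gts), 4))
```

Benchmarks in the COCO style usually average this quantity over several IoU thresholds and over classes, so a single-threshold value such as the one above is only the building block of such a score.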
Author Cai, Zhengqi
Zhang, Guowei
Wang, Weilan
Zhao, Penghai
Lu, Yuqi
Author_xml – sequence: 1
  givenname: Penghai
  orcidid: 0000-0002-9613-5318
  surname: Zhao
  fullname: Zhao, Penghai
  organization: Key Laboratory of China's Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, Lanzhou, China
– sequence: 2
  givenname: Weilan
  orcidid: 0000-0003-2935-6601
  surname: Wang
  fullname: Wang, Weilan
  email: wangweilan@xbmu.edu.cn
  organization: Key Laboratory of China's Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, Lanzhou, China
– sequence: 3
  givenname: Zhengqi
  surname: Cai
  fullname: Cai, Zhengqi
  organization: School of Mathematics and Computer Science, Northwest Minzu University, Lanzhou, China
– sequence: 4
  givenname: Guowei
  orcidid: 0000-0002-6656-3960
  surname: Zhang
  fullname: Zhang, Guowei
  organization: Key Laboratory of China's Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, Lanzhou, China
– sequence: 5
  givenname: Yuqi
  surname: Lu
  fullname: Lu, Yuqi
  organization: Key Laboratory of China's Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, Lanzhou, China
BookMark eNpNUU1vGyEQRVUqNU3zC3JB6nkdPhYWjq6bD0uWenB6RmMWUqwNpMAe_O-Ls1HUuczozXtvpHlf0UVM0SF0Q8mKUqJv15vN3X6_YoTRFadMCS4_oUtGpe54my_-m7-g61KOpJVqkBgu0bS2ds5QHb4P0XUPGVob8Q5Oaa54HWE6lVCwTxnXPw4_hlJTDhYm_BQOrkLEP5OdX1ys-AeUpkzxjbiNpS2tw3v3fN5CDSl-Q589TMVdv_cr9Pv-7mnz2O1-PWw3611ne6JqZ0c3jlZK4NIx4TlVhOjeQ69lr-RBKK1h4JwMxHupDo0gFPXApBqEBWL5FdouvmOCo3nN4QXyySQI5g1I-dlArsFOzhwIH9ToRz4w3msCWlvRAwUrHBkHyZrX98XrNae_syvVHNOc21-KYZJQ2feSycbiC8vmVEp2_uMqJeackllSMueUzHtKTXWzqIJz7kOhJRVUCP4POZmO_Q
CODEN IAECCG
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2021
DOI 10.1109/ACCESS.2021.3128536
Discipline Engineering
EISSN 2169-3536
EndPage 154447
Genre orig-research
GrantInformation_xml – fundername: Program for Leading Talent of State Ethnic Affairs Commission
– fundername: Program for Innovative Research Team of State Ethnic Affairs Commission (SEAC) ([2018]98)
– fundername: National Natural Science Foundation of China
  grantid: 61772430; 62166036
  funderid: 10.13039/501100001809
– fundername: Postgraduate Support Programs of Northwest Minzu University’s Fundamental Research Funds for the Central Universities
  grantid: Ymx2021002
  funderid: 10.13039/501100012226
– fundername: Natural Science Foundation of Gansu Province of China
  grantid: 21JR1RA195
  funderid: 10.13039/501100004775
– fundername: Gansu Provincial First-Class Discipline Program of Northwest Minzu University
  grantid: 11080305
  funderid: 10.13039/501100017619
ISSN 2169-3536
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Language English
ORCID 0000-0002-6656-3960
0000-0002-9613-5318
0000-0003-2935-6601
OpenAccessLink https://doaj.org/article/b0378dfd3723490a99c54a1ac5e0d762
PublicationDate 2021
PublicationPlace Piscataway
PublicationTitle IEEE Access
PublicationTitleAbbrev Access
PublicationYear 2021
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 154435
SubjectTerms Annotations
Character recognition
Datasets
Document analysis and recognition
fine-grained layout analysis
historical Tibetan document images
Image annotation
Image segmentation
Instance segmentation
Layout
layout analysis
Layouts
Object recognition
Semantics
Text analysis
text line segmentation
Text recognition
Title Accurate Fine-Grained Layout Analysis for the Historical Tibetan Document Based on the Instance Segmentation
URI https://ieeexplore.ieee.org/document/9615155
https://www.proquest.com/docview/2601644626
https://doaj.org/article/b0378dfd3723490a99c54a1ac5e0d762
Volume 9