Semantic Segmentation of PHT Based on Improved DeeplabV3+

Bibliographic Details
Published in	Mathematical problems in engineering Vol. 2022; pp. 1 - 8
Main Author	Fang, Haiquan
Format	Journal Article
Language	English
Published	New York Hindawi 19.03.2022 Hindawi Limited
Subjects	Accuracy; Algorithms; Artificial intelligence; Classification; Datasets; Deep learning; Digitization; Discriminant analysis; Handwriting; Labeling; Mathematical problems; Pixels; Semantic segmentation; Semantics; Teaching methods; Texts; Wavelet transforms

Summary:	This work addresses two shortcomings of printed and handwritten text (PHT) classification. First, the accuracy of FCN and U-Net, the models used for pixel-level PHT classification, still has room for improvement. Second, public PHT datasets have small sample sizes, so the models generalize poorly. In this paper, a pixel-level sample-making method for PHT identification is proposed, and a PHT dataset, PHTD 2021, containing 3,000 samples is constructed. Because documents contain a large number of words whose contours are small, the DeeplabV3+ model is then improved: the number of network layers and pooling operations is reduced, and the convolution kernel size and dilation rate are increased. In the experiment, the improved DeeplabV3+ model reached a classification accuracy of 95.06% on the test samples from the PHTD 2021 dataset. The improved model has a higher recognition accuracy than the FCN and the unmodified DeeplabV3+ models. Finally, two applications that follow PHT classification are presented: handwritten text removal and handwritten text extraction.
ISSN:	1024-123X 1563-5147
DOI:	10.1155/2022/6228532
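
The summary reports a pixel-level classification accuracy and two downstream applications, handwritten text removal and handwritten text extraction. The sketch below illustrates how those three operations could work on a predicted label mask. It is not the author's implementation: the class codes (0 = background, 1 = printed, 2 = handwritten), the white fill value, the function names, and the toy data are all assumptions made for illustration, and images are represented as plain nested lists of 8-bit grayscale values.

```python
# Hypothetical label convention (the record does not state the paper's encoding).
BACKGROUND, PRINTED, HANDWRITTEN = 0, 1, 2
WHITE = 255  # assumed page colour for 8-bit grayscale images


def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted class equals the ground-truth class."""
    pairs = [(p, t) for prow, trow in zip(pred, truth) for p, t in zip(prow, trow)]
    return sum(p == t for p, t in pairs) / len(pairs)


def remove_handwriting(image, mask):
    """Paint every pixel labelled HANDWRITTEN white; leave all other pixels as is."""
    return [[WHITE if m == HANDWRITTEN else px for px, m in zip(irow, mrow)]
            for irow, mrow in zip(image, mask)]


def extract_handwriting(image, mask):
    """Keep only pixels labelled HANDWRITTEN; paint everything else white."""
    return [[px if m == HANDWRITTEN else WHITE for px, m in zip(irow, mrow)]
            for irow, mrow in zip(image, mask)]


# Toy 2x4 page: one printed stroke (column 1) and one handwritten stroke (column 3).
image = [[255, 30, 255, 90],
         [255, 40, 255, 80]]
truth = [[0, 1, 0, 2],
         [0, 1, 0, 2]]
# Made-up prediction with one background pixel mislabelled as handwritten.
pred = [[0, 1, 0, 2],
        [0, 1, 2, 2]]

accuracy = pixel_accuracy(pred, truth)
cleaned = remove_handwriting(image, pred)
handwriting = extract_handwriting(image, pred)
```

In this sketch, removal and extraction are complementary masks over the same prediction, so any pixel the model mislabels ends up in the wrong output image.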