TDEM: Table Data Extraction Model Based on Cell Segmentation

Bibliographic Details
Published in IEICE Transactions on Information and Systems, Vol. E107.D, no. 10, pp. 1376-1379
Main Authors WANG, Zhe; LU, Zhe-Ming; LUO, Hao; ZHENG, Yang-Ming
Format Journal Article
Language English
Published The Institute of Electronics, Information and Communication Engineers 01.10.2024
Summary: To extract tabular data accurately, we propose a novel cell-based tabular data extraction model (TDEM). The key idea of TDEM is to combine the grayscale projection of row separation lines with the table masks and column masks generated by a VGG-19 neural network in order to segment each individual cell from the input table image. The text content of the table is then extracted one cell at a time, which greatly improves the accuracy of table recognition.
ISSN: 0916-8532, 1745-1361
DOI: 10.1587/transinf.2024EDL8029
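
The cell segmentation idea in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the table mask and column mask are already available as boolean arrays (in the paper they come from a VGG-19 network, which is not reproduced here), and the function names `runs` and `segment_cells` and the thresholds `dark_thresh` and `min_fill` are hypothetical choices made for this example. Row separation lines are located by projecting dark pixels onto the vertical axis, and column spans are read off the column mask.

```python
import numpy as np


def runs(flags):
    """Return half-open (start, stop) index pairs for each run of True values."""
    padded = np.concatenate(([False], np.asarray(flags, dtype=bool), [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return list(zip(edges[::2].tolist(), edges[1::2].tolist()))


def segment_cells(gray, table_mask, column_mask, dark_thresh=128, min_fill=0.8):
    """Split a table image into cell boxes (top, bottom, left, right).

    gray        : 2-D uint8 grayscale image.
    table_mask  : 2-D bool array, True inside the table (assumed given).
    column_mask : 2-D bool array, True inside table columns (assumed given).
    dark_thresh : assumed intensity below which a pixel counts as ink.
    min_fill    : assumed fraction of the table width that must be dark
                  for an image row to count as a separation line.
    """
    # Horizontal grayscale projection: dark pixels per image row, in the table.
    dark = (gray < dark_thresh) & table_mask
    width = table_mask.sum(axis=1)
    fill = dark.sum(axis=1) / np.maximum(width, 1)

    # Image rows that are almost entirely dark are treated as separation lines.
    line_runs = runs((fill >= min_fill) & (width > 0))

    # Each band between two consecutive separation lines is one table row.
    row_bands = [
        (prev_stop, next_start)
        for (_, prev_stop), (next_start, _) in zip(line_runs, line_runs[1:])
        if next_start > prev_stop
    ]

    # Column spans: image columns covered anywhere by the column mask.
    col_spans = runs(column_mask.any(axis=0))

    # The grid of row bands and column spans gives one box per cell.
    return [
        [(top, bottom, left, right) for left, right in col_spans]
        for top, bottom in row_bands
    ]
```

Each returned box can be used to crop `gray[top:bottom, left:right]` and pass that crop to a text recognizer, so that text is read from one cell at a time. The sketch assumes a table with ruled row lines and no merged cells.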