Contextual transformer sequence-based recognition network for medical examination reports



Bibliographic Details
Published in	Applied intelligence (Dordrecht, Netherlands), Vol. 53, no. 14, pp. 17363-17380
Main Authors	Wan, Honglin; Zhong, Zongfeng; Li, Tianping; Zhang, Huaxiang; Sun, Jiande
Format	Journal Article
Language	English
Published	New York: Springer US; Springer Nature B.V., 01.07.2023
Subjects	Accuracy; Artificial Intelligence; Coders; Computer Science; Datasets; Feature recognition; Health services; Image reconstruction; Machines; Manufacturing; Mechanical Engineering; Medical imaging; Methods; Physical examinations; Processes; Semantics; Transformers; Table recognition; Medical examination report; Contextual transformer; Sequence recognition
Summary:	Automatic recognition of medical examination report tables (MERT) has received increasing attention in recent years because it is an essential step toward intelligent healthcare and medical treatment. However, table prediction still faces challenges in practical use. In this paper, a recognition network (CoT_SRN) for medical examination reports is proposed to improve the recognition accuracy of MERT structure and to reconstruct the image into a spreadsheet. The network is based on a contextual transformer sequence and consists of a CoT encoder and an SRN decoder. In the encoder, a CNN backbone based on the Contextual Transformer (CoT) proposed in this paper extracts the structural features of the MERT image. In the decoder, an attention head with a gated recurrent unit (GRU) performs feature sequence recognition to obtain the cell locations and the table structure represented in a structured language. In addition, MERT structure labels are defined in a character-level HTML format and used in training the table structure recognition. The proposed method achieves competitive tree-edit-distance-based similarity (TEDS) scores on English datasets, such as PubTabNet and SciTSR, and on Chinese datasets, such as the Chinese medical document dataset (CMDD). This indicates that CoT_SRN helps preserve good performance in MERT structure recognition across languages. Additionally, the method is verified on practical examples with folds and small-angle deflection. The experimental results show that the proposed method is promising for practical application.
ISSN:	0924-669X 1573-7497
DOI:	10.1007/s10489-022-04420-4
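
The Summary says the decoder emits the table structure as HTML-style labels and that the image is reconstructed into a spreadsheet. As a rough illustration of that last step only, the sketch below maps a structure-only HTML string to spreadsheet cell positions. It is not the authors' implementation: the function name `structure_to_grid`, the restriction to `<tr>` and `<td>` tags, and the attribute syntax are all assumptions made for this example.

```python
import re

# Assumed tag vocabulary: <tr>, <td>, their closing tags, and optional
# rowspan/colspan attributes. The paper's actual label set may differ.
TAG = re.compile(r'<(/?)(tr|td)((?:\s+(?:rowspan|colspan)="\d+")*)\s*>')
SPAN = re.compile(r'(rowspan|colspan)="(\d+)"')


def structure_to_grid(html):
    """Return one (row, col, rowspan, colspan) tuple per <td>, in reading order."""
    cells = []
    occupied = set()  # grid slots already covered by an earlier cell
    row, col = -1, 0
    for closing, name, attrs in TAG.findall(html):
        if closing:
            continue  # closing tags carry no layout information here
        if name == "tr":
            row += 1
            col = 0
            continue
        spans = {key: int(value) for key, value in SPAN.findall(attrs)}
        rowspan = spans.get("rowspan", 1)
        colspan = spans.get("colspan", 1)
        # Skip slots filled by cells that span down from earlier rows.
        while (row, col) in occupied:
            col += 1
        cells.append((row, col, rowspan, colspan))
        for r in range(row, row + rowspan):
            for c in range(col, col + colspan):
                occupied.add((r, c))
        col += colspan
    return cells


if __name__ == "__main__":
    example = (
        '<tr><td rowspan="2"></td><td></td><td></td></tr>'
        '<tr><td colspan="2"></td></tr>'
    )
    for cell in structure_to_grid(example):
        print(cell)
```

Each tuple gives the top-left grid position of a cell and its extent, which is enough to place recognized cell text into a spreadsheet with merged cells.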