Information extraction of Chinese medical electronic records via evolutionary neural architecture search


Bibliographic Details
Published in: 2023 IEEE International Conference on Data Mining Workshops (ICDMW), pp. 396-405
Main Authors: Zhang, Tian; Li, Nan; Zhou, Yuee; Cai, Wei; Ma, Lianbo
Format: Conference Proceeding
Language: English
Published: IEEE, 04.12.2023
Summary: Information extraction (IE), which includes named entity recognition (NER) and relation extraction (RE), is often used to obtain valuable information from the semi-structured text of Chinese electronic medical records (CEMR). Recently, machine learning (ML) and deep learning (DL) models for IE have been applied to CEMR with positive results. However, the performance of ML and DL models relies on extensive expert knowledge and manual parameter tuning. In this paper, we propose a new method based on an evolutionary algorithm to tackle the problem of selecting the network structure and training parameters of BiLSTM for NER and RE. In our method, an efficient mixed-length genetic encoding strategy is designed to represent the BiLSTM network information and the model training parameters. Specifically, the learning rate, the number of hidden layers and neurons, and the optimizer can all be chosen automatically, yielding an optimal set of selections. In addition, we design efficient genetic operators that adapt to hybrid parameters and propose a deep-learning strategy to reduce computational overhead. Experiments show that our method achieves F1 scores of 89.13% and 60.85% on the CCKS2019 Task 1 and CHIP-2020 Task 2 datasets, respectively, which demonstrates its effectiveness.
ISSN: 2375-9259
DOI: 10.1109/ICDMW60847.2023.00056
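
The summary describes a mixed-length genetic encoding that covers both the BiLSTM structure and the training parameters, along with genetic operators for hybrid (continuous, categorical, and variable-length) genes. The following is a minimal Python sketch of that general idea, not the authors' implementation. This record does not give the paper's actual encoding, operators, search ranges, or overhead-reduction strategy, so every name and bound here (`Genome`, `OPTIMIZERS`, `NEURON_CHOICES`, `MAX_LAYERS`, `LR_RANGE`, `placeholder_fitness`) is an assumption made for illustration. In particular, `placeholder_fitness` is a toy stand-in for training a BiLSTM and measuring validation F1.

```python
import math
import random
from dataclasses import dataclass, replace

# Assumed search space; the paper's real ranges are not given in this record.
OPTIMIZERS = ["sgd", "adam", "rmsprop"]
NEURON_CHOICES = [32, 64, 128, 256]
MAX_LAYERS = 4
LR_RANGE = (1e-4, 1e-1)


@dataclass(frozen=True)
class Genome:
    learning_rate: float   # continuous gene
    optimizer: str         # categorical gene
    hidden_sizes: tuple    # variable-length part: one entry per BiLSTM layer


def clamp_lr(lr):
    return min(max(lr, LR_RANGE[0]), LR_RANGE[1])


def is_valid(g):
    return (LR_RANGE[0] <= g.learning_rate <= LR_RANGE[1]
            and g.optimizer in OPTIMIZERS
            and 1 <= len(g.hidden_sizes) <= MAX_LAYERS
            and all(n in NEURON_CHOICES for n in g.hidden_sizes))


def random_genome(rng):
    lo, hi = math.log10(LR_RANGE[0]), math.log10(LR_RANGE[1])
    depth = rng.randint(1, MAX_LAYERS)
    return Genome(
        learning_rate=clamp_lr(10 ** rng.uniform(lo, hi)),  # log-uniform sample
        optimizer=rng.choice(OPTIMIZERS),
        hidden_sizes=tuple(rng.choice(NEURON_CHOICES) for _ in range(depth)),
    )


def mutate(g, rng):
    """Apply one mutation, chosen according to the gene type."""
    kind = rng.choice(["lr", "opt", "resize", "grow", "shrink"])
    if kind == "lr":
        return replace(g, learning_rate=clamp_lr(g.learning_rate * 10 ** rng.uniform(-0.5, 0.5)))
    if kind == "opt":
        return replace(g, optimizer=rng.choice(OPTIMIZERS))
    sizes = list(g.hidden_sizes)
    if kind == "grow" and len(sizes) < MAX_LAYERS:
        sizes.insert(rng.randint(0, len(sizes)), rng.choice(NEURON_CHOICES))
    elif kind == "shrink" and len(sizes) > 1:
        sizes.pop(rng.randrange(len(sizes)))
    else:  # resize one layer (also the fallback when grow/shrink is not allowed)
        sizes[rng.randrange(len(sizes))] = rng.choice(NEURON_CHOICES)
    return replace(g, hidden_sizes=tuple(sizes))


def crossover(a, b, rng):
    """Uniform crossover on fixed genes, one-point cut on the layer lists."""
    cut_a = rng.randint(0, len(a.hidden_sizes))
    cut_b = rng.randint(0, len(b.hidden_sizes))
    sizes = (a.hidden_sizes[:cut_a] + b.hidden_sizes[cut_b:])[:MAX_LAYERS]
    if not sizes:  # both cuts can leave nothing; keep at least one layer
        sizes = (rng.choice(NEURON_CHOICES),)
    return Genome(
        learning_rate=rng.choice([a.learning_rate, b.learning_rate]),
        optimizer=rng.choice([a.optimizer, b.optimizer]),
        hidden_sizes=sizes,
    )


def placeholder_fitness(g):
    # Toy score only. A real run would train a BiLSTM with these settings
    # and return its validation F1.
    return -abs(math.log10(g.learning_rate) + 3) - 0.1 * len(g.hidden_sizes)


def evolve(fitness, generations=10, pop_size=8, seed=0):
    rng = random.Random(seed)
    pop = [random_genome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        pop = parents + children
    return max(pop, key=fitness)
```

Calling `evolve(placeholder_fitness)` returns the best `Genome` found under the toy score. Keeping the layer list as a separate variable-length field lets crossover and mutation change network depth without disturbing the fixed-length training genes, which is one simple way to handle hybrid parameters.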