Enhanced ICD-10 code assignment of clinical texts: A summarization-based approach
Assigning International Classification of Diseases (ICD) codes to clinical texts is a common and crucial practice in patient classification, hospital management, and further statistics analysis. Current auto-coding methods mainly transfer this task to a multi-label classification problem. Such solut...
Saved in:
Published in | Artificial intelligence in medicine Vol. 156; p. 102967 |
---|---|
Main Authors | , , , , , , , |
Format | Journal Article |
Language | English |
Published |
Netherlands
Elsevier B.V
01.10.2024
|
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | Assigning International Classification of Diseases (ICD) codes to clinical texts is a common and crucial practice in patient classification, hospital management, and further statistics analysis. Current auto-coding methods mainly transfer this task to a multi-label classification problem. Such solutions are suffering from high-dimensional mapping space and excessive redundant information in long clinical texts. To alleviate such a situation, we introduce text summarization methods to the ICD coding regime and apply text matching to select ICD codes.
We focus on the tenth revision of the ICD (ICD-10) coding and design a novel summarization-based approach (SuM) with an end-to-end strategy to efficiently assign ICD-10 code to clinical texts. In this approach, a knowledge-guided pointer network is purposed to distill and summarize key information in clinical texts precisely. Then a matching model with matching-aggregation architecture follows to align the summary result with code, tuning the one-vs-all scenario to one-vs-one matching so that the large-label-space obstacle laid in classification approaches would be avoided.
The 12,788 ICD-10 coded discharge summaries from a Chinese hospital were collected to evaluate the proposed approach. Compared with existing methods, the purposed model achieves the greatest coding results with Micro AUC of 0.9548, MRR@10 of 0.7977, Precision@10 of 0.0944, and Recall@10 of 0.9439 for the TOP-50 Dataset. Results on the FULL-Dataset remain consistent. Also, the proposed knowledge encoder and applied end-to-end strategy are proven to facilitate the whole model to gain efficacy in selecting the most suitable code.
The proposed automatic ICD-10 code assignment approach via text summarization can effectively capture critical messages in long clinical texts and improve the performance of ICD-10 coding of clinical texts.
•The text summarization method is used for ICD-10 coding with better distillation of key information in long clinical texts.•With the guidance of the main diagnosis, the summary of the long clinical text is precise and applicable for code matching.•The text matching method facilitates ICD-10 coding with interactive features and the one-vs-one operation.•A novel end-to-end model of the summarization module and text matching module is proposed. |
---|---|
Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 23 |
ISSN: | 0933-3657 1873-2860 1873-2860 |
DOI: | 10.1016/j.artmed.2024.102967 |