A Consideration of a Methodology of the Object-oriented Term Weighting Using Hierarchical Term Classification for Medical Document Analysis

Bibliographic Details
Published in Japan Journal of Medical Informatics Vol. 38; no. 2; pp. 69-79
Main Authors Matsuo, R; Ho, TB; Ikeda, M; Tanaka, K; Chen, W
Format Journal Article
Language Japanese
Published Japan Association for Medical Informatics 15.06.2018
Summary: This paper considers a methodology for object-oriented term weighting that uses a hierarchical structure of terms in medical documents, organized according to the analytical purpose. The hierarchical term classification exploits logical negation and medical ranking information corresponding to ICD-10 codes, and its nodes are categories of terms. The classification is used to generate weighting rules for the object-oriented term weighting, and the order relation among the categories is captured by assigning each category a weight based on the analytical purpose. Specifically, three weighting rules are generated from two features of the hierarchical term classification, namely the term hierarchy and the use of medical ranking information: terms are treated as more important for a given analytical purpose, and receive a higher weight, when they are located in a deeper layer, belong to a category of non-negated terms, or hold a higher rank in the hierarchy. Experimental results on mortality prediction, which is one such analytical purpose, indicate that the proposed object-oriented term weighting is effective. Terms that correspond to ICD-10 are regarded as the dependent and important terms for analytical purposes; for terms that do not correspond to ICD-10, the authors considered using machine learning techniques to capture similar dependencies with respect to the analytical purpose. The proposed methodology, the order relation among term categories derived from the weighting rules, and a dictionary of terms and weights have the potential to enhance knowledge acquisition support through big data analysis in the medical domain.
ISSN: 0289-8055, 2188-8469
DOI: 10.14948/jami.38.69
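
The three weighting rules named in the summary (deeper layer, non-negated category, higher rank) can be illustrated with a minimal Python sketch. The summary gives no formulas, so everything concrete here is an assumption made for illustration: the `Category` fields, the additive form of the weight, the step sizes, the negation factor, and the example terms are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Category:
    """A node in a hypothetical hierarchical term classification."""
    name: str
    depth: int      # layer in the hierarchy, root = 0 (assumed convention)
    negated: bool   # True if the category holds negated terms
    rank: int       # assumed: 0 = unranked, larger = higher medical rank


def category_weight(cat, depth_step=0.5, rank_step=0.25, negation_factor=0.5):
    """Assumed weighting: grows with depth and rank, shrinks for negated categories."""
    weight = 1.0 + depth_step * cat.depth + rank_step * cat.rank
    if cat.negated:
        weight *= negation_factor
    return weight


def weight_terms(term_counts, term_to_category):
    """Scale raw term counts by the weight of each term's category.

    Terms without a category keep their raw count (weight 1.0).
    """
    weighted = {}
    for term, count in term_counts.items():
        cat = term_to_category.get(term)
        factor = category_weight(cat) if cat is not None else 1.0
        weighted[term] = count * factor
    return weighted


# Hypothetical categories and terms, for illustration only.
deep_ranked = Category("disease/ranked", depth=2, negated=False, rank=1)
deep_negated = Category("disease/ranked/negated", depth=2, negated=True, rank=1)
shallow = Category("general", depth=0, negated=False, rank=0)

term_to_category = {
    "sepsis": deep_ranked,
    "no sepsis": deep_negated,
    "patient": shallow,
}
weighted = weight_terms({"sepsis": 2, "no sepsis": 2, "patient": 2, "ward": 2},
                        term_to_category)
```

With these assumed parameters, a term in a deep, ranked, non-negated category ends up with a larger weighted count than the same raw count in a negated or shallow category, which is the ordering the three rules describe. Any real use would need the paper's actual category hierarchy and weights.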