CIMI: Classify and Itemize Medical Image System for PFT Big Data Based on Deep Learning

Bibliographic Details
Published in Applied sciences Vol. 10; no. 23; p. 8575
Main Authors Kim, Tong Min; Lee, Seo-Joon; Lee, Hwa Young; Chang, Dong-Jin; Yoon, Chang Ii; Choi, In-Young; Yoon, Kun-Ho
Format Journal Article
Language English
Published Basel MDPI AG 01.12.2020
Summary: The value of pulmonary function test (PFT) data is increasing with the advent of coronavirus disease 2019 (COVID-19) and the rise in respiratory disease. However, these PFT data cannot be used directly in clinical studies because PFT results are stored as raw image files. In this study, the classification and itemization medical image (CIMI) system generates usable data from raw PFT images by automatically classifying the various PFT result types, extracting their texts, and storing them in a PFT database and Excel files. CIMI relies mainly on deep-learning-based optical character recognition (OCR) to classify and itemize PFT images from St. Mary’s Hospital. CIMI classified seven types of sheet and itemized 913,059 texts from 14,720 PFT image sheets, a volume that is impractical to process manually. The number, type, and location of the texts that can be extracted differ by PFT type; CIMI handles this by first classifying the PFT image sheets by type, which allows researchers to analyze the data. To demonstrate the performance of CIMI, its validation results were compared with those of four other algorithms. A total of 70 randomly selected sheets (ten from each type) containing 33,550 texts were used for the validation. CIMI achieved an accuracy of 95%, the highest among the five algorithms compared.
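The classify-then-itemize flow described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes OCR has already produced tokens with grid positions, and the sheet types, keywords, layouts, and function names below are all hypothetical. The record does not list the seven PFT types or the per-type layouts, and the sketch writes CSV text rather than Excel files.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One OCR result: recognized text plus a (row, col) grid position (assumed)."""
    text: str
    row: int
    col: int


# Hypothetical sheet types, each identified by keywords expected on the sheet.
TYPE_KEYWORDS = {
    "spirometry": {"FVC", "FEV1"},
    "diffusion": {"DLCO"},
}

# Hypothetical per-type layouts: item name -> grid position of its value.
TYPE_LAYOUT = {
    "spirometry": {"FVC": (0, 1), "FEV1": (1, 1)},
    "diffusion": {"DLCO": (0, 1)},
}


def classify(tokens):
    """Return the sheet type whose keywords overlap the tokens most, or None."""
    words = {t.text for t in tokens}
    best = max(TYPE_KEYWORDS, key=lambda k: len(TYPE_KEYWORDS[k] & words))
    return best if TYPE_KEYWORDS[best] & words else None


def itemize(tokens, sheet_type):
    """Read each item's value from the position given by the type's layout."""
    grid = {(t.row, t.col): t.text for t in tokens}
    return {name: grid.get(pos) for name, pos in TYPE_LAYOUT[sheet_type].items()}


def to_csv(rows):
    """Serialize (sheet_id, sheet_type, item, value) rows as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["sheet_id", "sheet_type", "item", "value"])
    writer.writerows(rows)
    return buf.getvalue()


def accuracy(extracted, reference):
    """Fraction of reference items whose extracted text matches exactly."""
    if not reference:
        raise ValueError("reference must not be empty")
    correct = sum(1 for k, v in reference.items() if extracted.get(k) == v)
    return correct / len(reference)


if __name__ == "__main__":
    sheet = [Token("FVC", 0, 0), Token("3.21", 0, 1),
             Token("FEV1", 1, 0), Token("2.50", 1, 1)]
    sheet_type = classify(sheet)
    items = itemize(sheet, sheet_type)
    print(to_csv([("s1", sheet_type, k, v) for k, v in items.items()]))
```

Classifying first is what makes the itemization step simple: once the type is known, a fixed per-type layout tells the extractor where each value should be. The accuracy function shows one plausible way to score extracted texts against a manually verified reference; the record does not state how the reported 95% was computed.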
ISSN: 2076-3417
DOI: 10.3390/app10238575