Real-Time or Near Real-Time Persisting Daily Healthcare Data Into HDFS and ElasticSearch Index Inside a Big Data Platform

Bibliographic Details
Published in IEEE Transactions on Industrial Informatics, Vol. 13, no. 2, pp. 595-606
Main Authors Chen, Dequan; Chen, Yi; Brownlow, Brian N.; Kanjamala, Pradip P.; Garcia Arredondo, Carlos A.; Radspinner, Bryan L.; Raveling, Matthew A.
Format Journal Article
Language English
Published Piscataway IEEE 01.04.2017
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: Mayo Clinic (MC) healthcare currently generates a large number of HL7 V2 messages: 0.7-1.1 million on weekends and 1.7-2.2 million on business days. Even with multiple RDBMS-based systems, this volume of HL7 messages still cannot be stored, analyzed, and retrieved in real time or near real time for enterprise-level clinic and nonclinic usage. To determine whether Big Data technology coupled with ElasticSearch technology can satisfy MC daily healthcare needs for HL7 message processing, a BigData platform was developed containing two identical Hadoop clusters (TDH1.3.2 version), each containing an ElasticSearch cluster and instances of a Storm topology, MayoTopology, for processing HL7 messages on MC ESB queues into an ElasticSearch index and the HDFS. The implemented BigData platform can process 62 ± 4 million HL7 messages per day, while the ElasticSearch index can provide ultrafast free-text searching on the order of 0.2 s per query on an index containing a dataset of 25 million HL7-derived JSON documents. The results suggest that the implemented BigData platform exceeds MC enterprise-level patient-care needs.
ISSN: 1551-3203, 1941-0050
DOI: 10.1109/TII.2016.2645606