Real-Time or Near Real-Time Persisting Daily Healthcare Data Into HDFS and ElasticSearch Index Inside a Big Data Platform
Mayo Clinic (MC) healthcare generates a large number of HL7 V2 messages: 0.7-1.1 million on weekends and 1.7-2.2 million on business days at present. With multiple RDBMS-based systems, such a large volume of HL7 messages still cannot be stored, analyzed, and retrieved in real time or near real time for...
Published in | IEEE transactions on industrial informatics Vol. 13; no. 2; pp. 595 - 606 |
---|---|
Main Authors | , , , , , , |
Format | Journal Article |
Language | English |
Published | Piscataway, IEEE, 01.04.2017; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Subjects | |
Summary: | Mayo Clinic (MC) healthcare generates a large number of HL7 V2 messages: 0.7-1.1 million on weekends and 1.7-2.2 million on business days at present. With multiple RDBMS-based systems, such a large volume of HL7 messages still cannot be stored, analyzed, and retrieved in real time or near real time for enterprise-level clinic and nonclinic usage. To determine whether Big Data technology coupled with ElasticSearch technology can satisfy MC daily healthcare needs for HL7 message processing, a Big Data platform was developed containing two identical Hadoop clusters (TDH1.3.2 version), each containing an ElasticSearch cluster and instances of a Storm topology (MayoTopology) that processes HL7 messages on MC ESB queues into an ElasticSearch index and the HDFS. The implemented Big Data platform can process 62 ± 4 million HL7 messages per day, while the ElasticSearch index can provide ultrafast free-text searching at about 0.2 s per query on an index containing a dataset of 25 million HL7-derived JSON documents. The results suggest that the implemented Big Data platform exceeds MC enterprise-level patient-care needs. |
---|---|
ISSN: | 1551-3203 1941-0050 |
DOI: | 10.1109/TII.2016.2645606 |
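
The summary's conclusion that the platform exceeds MC needs can be checked with simple arithmetic on the figures it reports. The following is a minimal sketch using only the numbers quoted in the summary; the variable names are illustrative, and the assumption that load is spread evenly over a 24-hour day is ours, not the authors'.

```python
# Figures quoted in the summary (messages per day).
PEAK_DAILY_LOAD = 2.2e6        # upper end of the business-day volume
CAPACITY_MEAN = 62e6           # reported platform throughput
CAPACITY_UNCERTAINTY = 4e6     # reported +/- on that throughput

SECONDS_PER_DAY = 24 * 60 * 60

# Headroom: how many times the peak daily load the platform can absorb.
headroom_mean = CAPACITY_MEAN / PEAK_DAILY_LOAD
headroom_low = (CAPACITY_MEAN - CAPACITY_UNCERTAINTY) / PEAK_DAILY_LOAD

# Average per-second rates, assuming an even spread over the day
# (an illustrative simplification; real traffic is bursty).
capacity_per_second = CAPACITY_MEAN / SECONDS_PER_DAY
load_per_second = PEAK_DAILY_LOAD / SECONDS_PER_DAY

print(f"Headroom at mean capacity:  {headroom_mean:.1f}x")
print(f"Headroom at lower bound:    {headroom_low:.1f}x")
print(f"Mean capacity:              {capacity_per_second:.0f} messages/s")
print(f"Peak business-day load:     {load_per_second:.0f} messages/s")
```

Even at the lower bound of the reported throughput, the capacity is more than twenty times the largest daily volume mentioned in the summary.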