A Disk-based Archival Storage System Using the EOS Erasure Coding Implementation for the ALICE Experiment at the CERN LHC

Bibliographic Details
Published in Journal of Information Science Theory and Practice Vol. 10; no. special; pp. 56-65
Main Authors Ahn, Sang Un; Bonfillou, Eric; Kim, Jeongheon; Panzer-Steindel, Bernd; Yoon, Heejun; Betev, Latchezar; Han, Heejune; Lee, Seung Hee; Peters, Andreas-Joachim
Format Journal Article
Language English
Published Daejeon Korea Institute of Science and Technology Information 2022
한국과학기술정보연구원
Summary: The Korea Institute of Science and Technology Information (KISTI) is a Worldwide LHC Computing Grid (WLCG) Tier-1 center mandated to preserve raw data produced by A Large Ion Collider Experiment (ALICE) at the world's largest particle accelerator, the Large Hadron Collider (LHC) at the European Organization for Nuclear Research (CERN). Tape is the physical medium most widely used for long-term data preservation, thanks to its reliability and its lowest price per capacity compared with other media such as optical disk, hard disk, and solid-state disk. However, the decreasing number of manufacturers of both tape drives and cartridges, and patent disputes among them, have escalated market risk. As an alternative to a tape-based data preservation strategy, we proposed a disk-only erasure-coded archival storage system, Custodial Disk Storage (CDS), powered by Exascale Open Storage (EOS), open-source storage management software developed by CERN. The CDS system consists of 18 high-density Just-Bunch-Of-Disks (JBOD) enclosures attached to 9 servers through 12 Gbps Serial Attached SCSI (SAS) Host Bus Adapter (HBA) interfaces via multiple paths for redundancy and multiplexing. For data protection, we introduced a Reed-Solomon (RS) (16, 4) Erasure Coding (EC) layout, in which each stripe of 16 blocks holds 12 data blocks and 4 parity blocks, giving an annual data loss probability of 5×10⁻¹⁴. In this paper, we discuss the CDS system design based on JBOD products, its performance limitations, and the data protection strategy accommodating the EOS EC implementation. We also present CDS operations for the ALICE experiment and a long-term power consumption measurement.
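The arithmetic behind the RS (16, 4) layout named in the summary can be sketched as follows. This is a minimal illustration, not the EOS implementation: the function name `ec_layout` and the dictionary keys are hypothetical, and the only inputs taken from the summary are the 16 total blocks and 4 parity blocks. It relies on the standard property of Reed-Solomon codes that any 12 of the 16 blocks are enough to rebuild a stripe. It does not attempt to reproduce the quoted annual data loss probability, which depends on disk failure and repair rates that the summary does not give.

```python
from fractions import Fraction


def ec_layout(total_blocks: int, parity_blocks: int) -> dict:
    """Derive basic properties of an erasure-coded stripe.

    Hypothetical helper for illustration only; not part of EOS.
    """
    if not 0 < parity_blocks < total_blocks:
        raise ValueError("parity_blocks must be between 1 and total_blocks - 1")
    data_blocks = total_blocks - parity_blocks
    return {
        "data_blocks": data_blocks,
        "parity_blocks": parity_blocks,
        # A Reed-Solomon stripe survives the loss of up to `parity_blocks` blocks.
        "tolerated_losses": parity_blocks,
        # Share of raw capacity that holds user data.
        "usable_fraction": Fraction(data_blocks, total_blocks),
        # Raw capacity consumed per unit of user data.
        "raw_per_usable": Fraction(total_blocks, data_blocks),
    }


# Values from the summary: 16 blocks per stripe, 4 of them parity.
layout = ec_layout(16, 4)
for key, value in layout.items():
    print(f"{key}: {value}")
```

Under these assumptions the layout stores 12 data blocks per stripe, keeps three quarters of the raw capacity usable, and tolerates the simultaneous loss of any 4 blocks in a stripe.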
Bibliography: http://data.doi.or.kr/10.1633/JISTaP.2022.10.S.6
ISSN: 2287-9099
2287-4577
DOI: 10.1633/JISTaP.2022.10.S.6