Fault Detection and Diagnosis Software of LHAASO
Published in | IEEE Transactions on Nuclear Science, Vol. 72, no. 3, pp. 301-308 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | New York, IEEE, 01.03.2025, The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Summary: | The Large High Altitude Air Shower Observatory (LHAASO) is a mega-scale dual-task facility designed to study cosmic rays and γ-rays. The online computing system of LHAASO supports its online operation and computation. Physical phenomena such as cosmic rays occur unpredictably, so the online computing system must run without interruption. Because LHAASO is large and operates in a harsh environment, the online computing system is prone to failure. When a system fails, maintenance personnel must quickly analyze the cause of the failure and repair it. The fault detection and diagnosis software (FADD) is designed to detect and analyze system faults quickly. The software monitors each component of LHAASO's online computing system (computing nodes, switches, and data flow software) and collects real-time status information. When a fault occurs, FADD analyzes its cause and sends alarm information to the on-call staff as soon as possible. It can also analyze historical data within a specified period and generate data reports as needed. FADD's design takes into account the characteristics of large-scale high-energy physics experiments and uses a distributed architecture to satisfy their requirements for high throughput and high efficiency. The software consists of three layers (an information collection layer, a data analysis layer, and a result layer) and contains metrics detection software, a fault monitoring module, a fault diagnosis module, and other functional modules. FADD has been applied to LHAASO and can diagnose operational faults quickly and accurately, helping to reduce the burden on maintenance personnel. |
---|---|
ISSN: | 0018-9499 1558-1578 |
DOI: | 10.1109/TNS.2024.3454806 |
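
The three layers named in the summary (information collection, data analysis, result) can be pictured as a small pipeline. The sketch below is only an illustration under stated assumptions and is not taken from FADD itself: the names `Metric`, `LIMITS`, `collect_metrics`, `analyze`, and `build_alarms`, the component identifiers, and the threshold values are all hypothetical, and a fixed snapshot stands in for real probes on computing nodes, switches, and data flow software.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Metric:
    component: str  # hypothetical identifier, e.g. "node-01"
    kind: str       # "computing_node", "switch", or "data_flow_software"
    name: str       # metric name, e.g. "cpu_load"
    value: float


# Hypothetical upper limits per metric; the values are illustrative only.
LIMITS: Dict[str, float] = {
    "cpu_load": 0.95,
    "packet_loss": 0.01,
    "queue_depth": 1000.0,
}


def collect_metrics() -> List[Metric]:
    """Information collection layer: a fixed snapshot standing in for real probes."""
    return [
        Metric("node-01", "computing_node", "cpu_load", 0.42),
        Metric("switch-03", "switch", "packet_loss", 0.08),
        Metric("dataflow-a", "data_flow_software", "queue_depth", 250.0),
    ]


def analyze(metrics: List[Metric]) -> List[Metric]:
    """Data analysis layer: keep only metrics that exceed their limit."""
    return [m for m in metrics if m.name in LIMITS and m.value > LIMITS[m.name]]


def build_alarms(faults: List[Metric]) -> List[str]:
    """Result layer: turn each detected fault into an alarm line for on-call staff."""
    return [
        f"ALARM {m.kind} {m.component}: {m.name}={m.value} exceeds {LIMITS[m.name]}"
        for m in faults
    ]


alarms = build_alarms(analyze(collect_metrics()))
for line in alarms:
    print(line)
```

Keeping each layer as a separate function mirrors the layered description in the summary. A real deployment would presumably distribute collection across many hosts, which this single-process sketch does not attempt.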