Implementation of a Dataset Interoperability Working Group (DIWG) Recommendation Compliance Test Tool for Earth Science Data Using pytest Framework

Bibliographic Details
Published in: 2024 12th International Conference on Agro-Geoinformatics (Agro-Geoinformatics), pp. 1-6
Main Authors: Yu, Eugene G.; Hegde, Mahabaleshwara S.; Di, Liping; Leonard, Peter J. T.
Format: Conference Proceeding
Language: English
Published: IEEE, 15.07.2024
Summary: Improving data interoperability and adherence to standards is critical for Earth Science research and applications. The Earth Science Data Systems (ESDS) Dataset Interoperability Working Group (DIWG) has developed a set of recommendations that help data producers improve interoperability. A compliance test tool is being developed to check an Earth Science data file against the DIWG recommendations. The tool focuses on three areas: the names of individual granules; metadata compliance at the variable, granule, and collection levels; and data value compliance. It is built on pytest, a widely used testing framework in the Python ecosystem. The tool can be run in an isolated virtual environment, a container, or a virtual machine, which eases integration into different workflows and systems. It also includes Web API services, implemented with a predefined schema, so that tests can be invoked in a Web environment or a cloud computing environment. Compliance reports are available in multiple formats: pytest output, JSON, XML, HTML, Markdown, and PDF. The tool is demonstrated through case studies on benchmark datasets from several sources, including samples from HDFGROUP and various Distributed Active Archive Centers (DAACs), with special emphasis on data from the Goddard Earth Sciences Data and Information Services Center (GES DISC).
In short, the study presents an approach to ensuring data interoperability in Earth science by implementing a compliance test tool based on the DIWG recommendations. The case studies highlight the tool's potential for widespread use in the field, and its flexibility, adaptability, and reporting capabilities make it a useful resource for data providers aiming to adhere to the DIWG recommendations.
ISSN: 2995-0643
DOI: 10.1109/Agro-Geoinformatics262780.2024.10660949
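
The summary names three areas of checking (granule names, metadata, and data values) on top of pytest. The sketch below illustrates how such checks could be laid out as pytest tests. It is not the authors' implementation: the rules (a compact date-time stamp in the file name, `units` and `long_name` on every variable, values within a valid range), the function names, and the mock granule are all assumptions made for illustration, and a real tool would read the metadata from the data file rather than from a dictionary.

```python
import re

# Assumed rule (illustrative only): the granule name carries a compact
# UTC date-time stamp of the form yyyymmddThhmmssZ.
TIMESTAMP = re.compile(r"\d{8}T\d{6}Z")

# Assumed rule (illustrative only): every variable declares these attributes.
REQUIRED_VARIABLE_ATTRS = ("units", "long_name")


def granule_name_problems(filename):
    """Return a list of problems found in the granule file name."""
    if TIMESTAMP.search(filename):
        return []
    return [f"{filename}: no yyyymmddThhmmssZ timestamp found"]


def variable_metadata_problems(variables):
    """Return one problem per required attribute missing from a variable."""
    problems = []
    for name in sorted(variables):
        for attr in REQUIRED_VARIABLE_ATTRS:
            if attr not in variables[name]:
                problems.append(f"{name}: missing attribute '{attr}'")
    return problems


def data_value_problems(name, values, valid_range, fill_value):
    """Report values that are neither fill values nor inside valid_range."""
    low, high = valid_range
    bad = [v for v in values if v != fill_value and not low <= v <= high]
    if bad:
        return [f"{name}: {len(bad)} value(s) outside [{low}, {high}]"]
    return []


# Hypothetical granule, held in memory so the sketch needs no data file.
GRANULE = {
    "filename": "EXAMPLE_L2_20240715T120000Z.nc",
    "variables": {
        "temperature": {"units": "K", "long_name": "air temperature"},
    },
    "values": {"temperature": [250.0, 301.5, -9999.0]},
}


# pytest collects functions whose names start with "test_".
def test_granule_name():
    assert granule_name_problems(GRANULE["filename"]) == []


def test_variable_metadata():
    assert variable_metadata_problems(GRANULE["variables"]) == []


def test_data_values():
    values = GRANULE["values"]["temperature"]
    assert data_value_problems("temperature", values, (150.0, 350.0), -9999.0) == []
```

Saved as, say, `test_compliance.py`, the sketch runs with `pytest test_compliance.py`. pytest's built-in `--junitxml=report.xml` option writes an XML report; whether the tool described in the summary produces its XML output this way is not stated in the record.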