On the Issue of Incomplete and Missing Water-Quality Data in Mine Site Databases: Comparing Three Imputation Methods

Bibliographic Details
Published in	Mine Water and the Environment, Vol. 35, no. 1, pp. 3-9
Main Authors	Betrie, Getnet D.; Sadiq, Rehan; Tesfamariam, Solomon; Morin, Kevin A.
Format	Journal Article
Language	English
Published	Berlin/Heidelberg; Springer Berlin Heidelberg; 01.03.2016; Springer Nature B.V.
Subjects	Artificial intelligence; Drainage; Earth and Environmental Science; Earth Sciences; Ecotoxicology; Geology; Gold; Hydrogeology; Industrial Pollution Prevention; Mine drainage; Mineral Resources; Mines; Molybdenum; Rhenium; Silver; Statistical analysis; Technical Article; Water analysis; Water quality; Water Quality/Water Pollution; AMELIA; Data-driven; Missing values; Machine learning; IMPSEQ; IRMI
Summary:	Large water-quality databases are valuable for predicting mine drainage chemistry, identifying optimal measures for mitigation and remediation, and refuting/refining models and theories. However, such databases often have missing values due to periodic lack of sampling and analysis or input errors. These missing values cause problems in machine learning and statistical analysis of water-quality data from mine sites. Using water-quality data collected from 1971 to 1994 at many locations on a copper-molybdenum-gold-silver-rhenium mine site, we compared three imputation methods for estimating missing water-quality data: iterative robust model-based imputation (IRMI), multiple imputations of incomplete multivariate data (AMELIA), and sequential imputation for missing values (IMPSEQ). The methods were evaluated using three error measures: mean absolute error, relative absolute error, and percent bias. The results showed that IMPSEQ and IRMI are suitable for imputing missing values in water-quality databases at mine sites, whereas AMELIA is not.
ISSN:	1025-9112 1616-1068
DOI:	10.1007/s10230-014-0322-4
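
Note on the evaluation measures: the summary names mean absolute error, relative absolute error, and percent bias but does not give their formulas. The Python sketch below uses common textbook definitions, which are an assumption and may differ from those in the article (the sign convention for percent bias in particular varies between fields). The function names and sample values are hypothetical and are not taken from the study.

```python
from statistics import fmean


def mean_absolute_error(true, imputed):
    """Average absolute difference between imputed and true values."""
    return fmean([abs(i - t) for t, i in zip(true, imputed)])


def relative_absolute_error(true, imputed):
    """Total absolute error relative to a baseline that always predicts the mean.

    Values below 1 mean the imputation beats the mean-only baseline.
    """
    baseline = fmean(true)
    numerator = sum(abs(i - t) for t, i in zip(true, imputed))
    denominator = sum(abs(t - baseline) for t in true)
    return numerator / denominator


def percent_bias(true, imputed):
    """Signed total error as a percentage of the true total.

    Assumed convention: positive means the imputed values overestimate.
    """
    return 100.0 * sum(i - t for t, i in zip(true, imputed)) / sum(true)


# Hypothetical values: entries deliberately masked from a complete record,
# and the estimates an imputation method produced for them.
true_values = [2.0, 4.0, 6.0, 8.0]
imputed_values = [3.0, 4.0, 5.0, 10.0]

mae = mean_absolute_error(true_values, imputed_values)
rae = relative_absolute_error(true_values, imputed_values)
pbias = percent_bias(true_values, imputed_values)
```

In a comparison of this kind, known values are typically masked, imputed by each method, and scored against the held-out truth; lower absolute errors and a percent bias closer to zero indicate a better imputation.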