Research of error data detection algorithm based on rules

Bibliographic Details
Published in: 2011 IEEE 3rd International Conference on Communication Software and Networks, pp. 159-163
Main Authors: Zhang, Zhong-Bin; Zhou, Yu-Hua; Liu, Yong-Zhi
Format: Conference Proceeding
Language: English
Published: IEEE, 01.05.2011
Summary: Data entry errors, improper integration, changes in the data environment, and similar factors all affect data quality. Among the resulting problems, erroneous data is the most serious. To clean up erroneous data, let information systems play their intended role, and improve data quality, the paper studies a rule-based method for detecting erroneous data: it analyzes the detection process, establishes a common set of detection rules, discusses how SQL statements are incorporated into the rules, implements the detection algorithm, and applies a series of optimizations. The method is easy to use, its rules are simple, and its efficiency and false discovery rate are high after optimization. The authors therefore consider it a potentially good method for data cleaning.
ISBN: 9781612844855, 1612844855
DOI: 10.1109/ICCSN.2011.6014412
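
Illustration: the summary describes detection rules that are tied to SQL statements. The record does not give the paper's rule set or algorithm, so the following is only a minimal sketch of one possible reading, in which each rule is a SQL predicate that is true for an erroneous row. The table `staff`, its columns, the three rules, and the helpers `rule_to_sql`, `detect_errors`, and `demo` are all hypothetical and are not taken from the paper.

```python
import sqlite3

# Hypothetical rule set (not from the paper): each rule name maps to a SQL
# predicate that evaluates to TRUE for a row considered erroneous.
RULES = {
    "age_out_of_range": "age IS NOT NULL AND (age < 0 OR age > 150)",
    "missing_name": "name IS NULL OR TRIM(name) = ''",
    "end_before_start": "end_date < start_date",  # ISO dates compare as text
}


def rule_to_sql(table, key, predicate):
    """Turn one rule predicate into a SELECT that lists violating row keys."""
    return f"SELECT {key} FROM {table} WHERE {predicate} ORDER BY {key}"


def detect_errors(conn, table, key, rules):
    """Run every rule against the table and collect the violating keys."""
    findings = {}
    for name, predicate in rules.items():
        rows = conn.execute(rule_to_sql(table, key, predicate)).fetchall()
        findings[name] = [row[0] for row in rows]
    return findings


def demo():
    """Build a small in-memory table with made-up rows and check it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, "
        "start_date TEXT, end_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO staff VALUES (?, ?, ?, ?, ?)",
        [
            (1, "Li", 34, "2009-01-01", "2010-06-30"),    # clean row
            (2, "", 41, "2008-03-15", "2011-01-01"),      # empty name
            (3, "Wang", 212, "2010-02-01", "2010-12-31"),  # implausible age
            (4, "Chen", 29, "2011-05-01", "2010-05-01"),  # ends before start
        ],
    )
    findings = detect_errors(conn, "staff", "id", RULES)
    conn.close()
    return findings


if __name__ == "__main__":
    for rule, ids in demo().items():
        print(rule, ids)
```

Keeping each rule as a plain predicate means a new check can be added without touching the detection loop. The predicates here are trusted constants; rules supplied by users would need validation before being interpolated into SQL.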