Dealing with Dangerous Data: Part-Whole Validation for Low Incident, High Risk Data


Bibliographic Details
Published in: Journal of database management, Vol. 27, no. 1, pp. 29-57
Main Authors: Chua, Cecil Eng Huang; Storey, Veda C
Format: Journal Article
Language: English
Published: Hershey: IGI Global, 01.01.2016
Summary: In certain situations, syntactically valid, but incorrect, data entered into a database can result in near-immediate, catastrophic financial losses for an organization. Examples include: omitting zeros in prices of goods on e-commerce sites; and financial fraud where data is directly entered into databases, bypassing application-level financial checks. Such “dangerous data” can, and should, be detected, because it deviates substantially from the statistical properties of existing data. Detection of this kind of problem requires comparing individual data items to a large amount of existing data in the database at run-time. Furthermore, the identification of errors is probabilistic, rather than deterministic, in nature. This research proposes part-whole validation as an approach to addressing the dangerous data situation. Part-whole validation addresses fundamental issues in database management, for example, integrity maintenance. Illustrative and representative examples are first defined, and analyzed. Then, an architecture for part-whole validation is presented and implemented in a prototype to illustrate the feasibility of the research.
ISSN: 1063-8016, 1533-8010
DOI: 10.4018/JDM.2016010102