Decision Trees for Uncertain Data

Bibliographic Details
Published in: 2009 IEEE 25th International Conference on Data Engineering, pp. 441-444
Main Authors: Tsang, S.; Kao, B.; Yip, K.Y.; Wai-Shing Ho; Sau Dan Lee
Format: Conference Proceeding
Language: English
Published: IEEE, 01.03.2009
Summary: Traditional decision tree classifiers work with data whose values are known and precise. We extend such classifiers to handle data with uncertain information, which originates from measurement/quantisation errors, data staleness, multiple repeated measurements, etc. The value uncertainty is represented by multiple values forming a probability distribution function (pdf). We discover that the accuracy of a decision tree classifier can be much improved if the whole pdf, rather than a simple statistic, is taken into account. We extend classical decision tree building algorithms to handle data tuples with uncertain values. Since processing pdfs is computationally more costly, we propose a series of pruning techniques that can greatly improve the efficiency of the construction of decision trees.
ISBN: 9781424434220, 142443422X
ISSN: 1063-6382, 2375-026X
DOI: 10.1109/ICDE.2009.26
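
Illustration: the summary contrasts using the whole pdf of an uncertain value with using a single statistic such as the mean. The following is a minimal Python sketch of that contrast, not the authors' implementation. It assumes each uncertain value is given as a discrete pdf (a list of `(value, probability)` pairs) and that a candidate split is scored by weighted entropy. The names `left_mass`, `entropy`, `split_entropy`, and `example` are hypothetical, and the pruning techniques mentioned in the summary are not shown.

```python
import math
from collections import defaultdict

def left_mass(pdf, split):
    """Probability mass of a discrete pdf at or below the split point."""
    return sum(p for x, p in pdf if x <= split)

def entropy(weights):
    """Shannon entropy (in bits) of a dict mapping class label -> weight."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    return -sum((w / total) * math.log2(w / total)
                for w in weights.values() if w > 0)

def split_entropy(tuples, split, use_pdf=True):
    """Weighted entropy of the two branches produced by a split point.

    tuples: list of (pdf, label) pairs.
    use_pdf=True  -> each tuple is divided fractionally between branches
                     according to its pdf (distribution-based).
    use_pdf=False -> each tuple is sent whole to one branch according to
                     the mean of its pdf (single-statistic baseline).
    """
    left, right = defaultdict(float), defaultdict(float)
    for pdf, label in tuples:
        if use_pdf:
            p_left = left_mass(pdf, split)
        else:
            mean = sum(x * p for x, p in pdf)
            p_left = 1.0 if mean <= split else 0.0
        left[label] += p_left
        right[label] += 1.0 - p_left
    n_left, n_right = sum(left.values()), sum(right.values())
    n = n_left + n_right
    return (n_left / n) * entropy(left) + (n_right / n) * entropy(right)

# Two made-up tuples with the same mean (2.0) but different pdfs.
example = [
    ([(1.0, 0.5), (3.0, 0.5)], "A"),
    ([(2.0, 1.0)], "B"),
]

by_mean = split_entropy(example, split=2.0, use_pdf=False)
by_pdf = split_entropy(example, split=2.0, use_pdf=True)
print(by_mean, by_pdf)
```

In this made-up example the two tuples are indistinguishable by their means, so the mean-based split leaves both classes mixed in one branch, while the pdf-based split assigns half of tuple "A" to the other branch and yields a lower weighted entropy. Evaluating the pdf for every tuple at every candidate split is also the extra cost that the summary's pruning techniques are meant to reduce.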