Decision Trees for Uncertain Data

Bibliographic Details
Published in	2009 IEEE 25th International Conference on Data Engineering, pp. 441-444
Main Authors	Tsang, S.; Kao, B.; Yip, K.Y.; Ho, Wai-Shing; Lee, Sau Dan
Format	Conference Proceeding
Language	English
Published	IEEE 01.03.2009
Subjects	Buildings; c4.5; classification; Classification tree analysis; Clustering algorithms; Computer science; Data engineering; decision tree; Decision trees; Probability distribution; Quantization; Statistical distributions; Testing; uncertain data

Summary:	Traditional decision tree classifiers work with data whose values are known and precise. We extend such classifiers to handle data with uncertain information, which originates from measurement/quantisation errors, data staleness, multiple repeated measurements, etc. The value uncertainty is represented by multiple values forming a probability distribution function (pdf). We discover that the accuracy of a decision tree classifier can be much improved if the whole pdf, rather than a simple statistic, is taken into account. We extend classical decision tree building algorithms to handle data tuples with uncertain values. Since processing pdfs is computationally more costly, we propose a series of pruning techniques that can greatly improve the efficiency of the construction of decision trees.
ISBN:	9781424434220 142443422X
ISSN:	1063-6382 2375-026X
DOI:	10.1109/ICDE.2009.26
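
Illustration:	The Summary's contrast between using the whole pdf and using a single statistic can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' algorithm: it assumes each uncertain value is stored as a list of (value, probability) samples, that a tuple is divided fractionally between the two branches of a split according to its probability mass on each side, and that split quality is measured by weighted entropy. The names entropy, split_entropy and collapse_to_mean are hypothetical and do not come from the paper or from any library.

```python
import math
from collections import defaultdict


def entropy(class_weights):
    """Shannon entropy (bits) of a mapping from class label to weight."""
    total = sum(class_weights.values())
    if total == 0:
        return 0.0
    h = 0.0
    for w in class_weights.values():
        if w > 0:
            p = w / total
            h -= p * math.log2(p)
    return h


def split_entropy(tuples, z):
    """Weighted entropy after splitting at 'value <= z'.

    tuples: list of (samples, label); samples is a list of
    (value, probability) pairs whose probabilities sum to 1.
    Assumption: a tuple contributes its probability mass at or
    below z to the left branch and the remainder to the right.
    """
    left = defaultdict(float)
    right = defaultdict(float)
    for samples, label in tuples:
        p_left = sum(p for v, p in samples if v <= z)
        left[label] += p_left
        right[label] += 1.0 - p_left
    n_left = sum(left.values())
    n_right = sum(right.values())
    n = n_left + n_right
    if n == 0:
        return 0.0
    return (n_left / n) * entropy(left) + (n_right / n) * entropy(right)


def collapse_to_mean(tuples):
    """Replace each pdf by its expected value (the 'simple statistic')."""
    return [
        ([(sum(v * p for v, p in samples), 1.0)], label)
        for samples, label in tuples
    ]


# Two made-up tuples with the same mean (3) but different pdfs.
pdf_tuples = [
    ([(1.0, 0.5), (5.0, 0.5)], "a"),
    ([(3.0, 1.0)], "b"),
]
mean_tuples = collapse_to_mean(pdf_tuples)

print(split_entropy(mean_tuples, 2.0))  # means coincide, so no split separates them
print(split_entropy(pdf_tuples, 2.0))   # lower: the pdfs differ below z = 2
```

In this toy data both tuples have the same mean, so a tree built on means cannot tell them apart, while the pdf-based split reduces the entropy. Evaluating every candidate split point against every sample is what makes the pdf-based approach more expensive, which is the cost the pruning techniques mentioned in the Summary are meant to reduce.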