Khiops: A Statistical Discretization Method of Continuous Attributes

Bibliographic Details
Published in: Machine Learning, Vol. 55, no. 1, pp. 53-69
Main Author: Boulle, Marc
Format: Journal Article
Language: English
Published: Dordrecht, Springer Nature B.V., 01.04.2004
Summary: In supervised machine learning, some algorithms are restricted to discrete data and have to discretize continuous attributes. Many discretization methods, based on statistical criteria, information content, or other specialized criteria, have been studied in the past. In this paper, we propose the discretization method Khiops, based on the chi-square statistic. In contrast with related methods ChiMerge and ChiSplit, this method optimizes the chi-square criterion in a global manner on the whole discretization domain and does not require any stopping criterion. A theoretical study followed by experiments demonstrates the robustness and the good predictive performance of the method.
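For readers unfamiliar with the criterion named in the summary, the following is a minimal sketch of the standard Pearson chi-square statistic computed on an intervals-by-classes contingency table, together with the merging of two adjacent intervals. It illustrates only the quantity that chi-square-based discretization methods evaluate; it is not the Khiops procedure, whose details this record does not give. The function names `chi_square` and `merge_adjacent` and the example counts are hypothetical.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table.

    `table` is a list of rows, one row per interval of the discretized
    attribute; each row holds the observed count of every class.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if interval and class were independent.
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


def merge_adjacent(table, i):
    """Return a new table in which intervals i and i + 1 are merged."""
    merged = [a + b for a, b in zip(table[i], table[i + 1])]
    return table[:i] + [merged] + table[i + 2:]


# Hypothetical counts: three intervals, two classes.
counts = [[8, 2], [7, 3], [1, 9]]
before = chi_square(counts)
after = chi_square(merge_adjacent(counts, 0))
print(before, after)
```

Comparing the statistic before and after a merge shows how a candidate discretization can be scored as a whole rather than one pair of intervals at a time.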
ISSN: 0885-6125, 1573-0565
DOI: 10.1023/B:MACH.0000019804.29836.05