Multivariate Regression Forest for Categorical Attribute Data

Bibliographic Details
Published in Ji suan ji ke xue Vol. 49; no. 1; pp. 108-114
Main Authors Liu, Zhen-yu, Song, Xiao-ying
Format Journal Article
Language Chinese
Published Chongqing Guojia Kexue Jishu Bu 01.01.2022
Editorial office of Computer Science
Summary: Regression models such as linear regression, SVR, and most multivariate regression trees cannot directly use categorical attributes in regression analysis. To address this, a decision tree node-splitting method that can combine multiple attribute types is proposed. The method defines the center of a sample set on its categorical attributes and the distance from a sample to that center, so that categorical attributes can participate in the clustering of samples in the same way as numerical attributes, and the resulting clusters form the partition of the sample set at each node. An appropriate ensemble scheme is then selected for these decision trees, and the resulting ensemble is called Cluster Regression Forest (CRF). Finally, the mean absolute error (MAE) and root mean square error (RMSE) of CRF and nine other regression models are compared on 12 public UCI datasets. The experimental results show that CRF performs best among the 10 regression models.
ISSN:1002-137X
DOI:10.11896/jsjkx.201200189
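
The summary describes splitting a tree node by clustering samples around centers defined over both numerical and categorical attributes. The following is a minimal sketch of that idea, not the authors' method: the record does not give the paper's definitions, so the center (mean on numerical attributes, most frequent value on categorical ones), the distance (squared difference plus a 0/1 mismatch penalty), and the helper names `center`, `distance`, and `split_node` are all assumptions made for illustration.

```python
from collections import Counter
import math


def center(samples, num_idx, cat_idx):
    """Assumed center: mean on numerical attributes, mode on categorical ones."""
    c = {}
    for j in num_idx:
        c[j] = sum(s[j] for s in samples) / len(samples)
    for j in cat_idx:
        c[j] = Counter(s[j] for s in samples).most_common(1)[0][0]
    return c


def distance(sample, c, num_idx, cat_idx):
    """Assumed distance: squared numeric difference plus 1 per categorical mismatch."""
    d = sum((sample[j] - c[j]) ** 2 for j in num_idx)
    d += sum(1 for j in cat_idx if sample[j] != c[j])
    return math.sqrt(d)


def split_node(samples, num_idx, cat_idx, iters=10):
    """Split a node into two groups with a 2-cluster, k-means-style loop."""
    # Hypothetical initialisation: first and last sample act as starting centers.
    centers = [center([samples[0]], num_idx, cat_idx),
               center([samples[-1]], num_idx, cat_idx)]
    groups = [[], []]
    for _ in range(iters):
        groups = [[], []]
        for s in samples:
            d0 = distance(s, centers[0], num_idx, cat_idx)
            d1 = distance(s, centers[1], num_idx, cat_idx)
            groups[0 if d0 <= d1 else 1].append(s)
        if not groups[0] or not groups[1]:
            break  # degenerate split; stop refining
        centers = [center(g, num_idx, cat_idx) for g in groups]
    return groups


# Toy data: attribute 0 is numerical, attribute 1 is categorical.
samples = [(1.0, "red"), (1.2, "red"), (5.0, "blue"), (5.3, "blue")]
left, right = split_node(samples, num_idx=[0], cat_idx=[1])
print(left)
print(right)
```

In this sketch each child node would be split again in the same way, and a forest would combine many such trees. How the paper chooses the number of clusters, the leaf predictions, and the ensemble scheme is not stated in this record.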