Missing Data Imputation for Geolocation-based Price Prediction Using KNN–MCF Method


Bibliographic Details
Published in: ISPRS International Journal of Geo-Information, Vol. 9, no. 4, p. 227
Main Authors: Sanjar, Karshiev; Bekhzod, Olimov; Kim, Jaesoo; Paul, Anand; Kim, Jeonghong
Format: Journal Article
Language: English
Published: Basel, MDPI AG, 01.04.2020
Summary: Accurate house price forecasts are important for formulating national economic policies. In this paper, we propose a method for predicting house sale prices. Our algorithm combines one-hot encoding to convert text data into numeric data, feature correlation to select only the most correlated variables, and a technique for handling missing data. That technique, the K-nearest neighbor algorithm based on the most correlated features (KNN–MCF), is an effective way to handle missing data in large datasets. To the best of our knowledge, no previous research has focused on the important features when dealing with missing observations. Combined with the random forest algorithm, the proposed method achieves a prediction accuracy of 92.01%, which is higher than that of the other typical machine learning prediction algorithms it was compared with.
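
The imputation idea in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not say how correlation is measured or how neighbors are combined, so the sketch assumes Pearson correlation against the sale price, unscaled Euclidean distance, and a plain mean over the k nearest complete rows. The function names (`pearson`, `most_correlated`, `knn_mcf_impute`), the parameters `k` and `m`, and the toy housing rows are all hypothetical. One-hot encoding and the random forest step are left out.

```python
import math

def pearson(xs, ys):
    """Pearson correlation over the pairs where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        return 0.0
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    vx = sum((x - mx) ** 2 for x, _ in pairs)
    vy = sum((y - my) ** 2 for _, y in pairs)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def most_correlated(rows, target, candidates, m):
    """Return the m candidate columns with the largest |correlation| to target."""
    tcol = [r[target] for r in rows]
    scored = [(abs(pearson([r[c] for r in rows], tcol)), c) for c in candidates]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [c for _, c in scored[:m]]

def knn_mcf_impute(rows, column, target, k=2, m=2):
    """Fill None entries of `column` with the mean over the k nearest donors.

    Distance is measured only on the m features most correlated with
    `target` (an assumption of this sketch). Assumes only `column` has
    missing values and that at least one complete donor row exists.
    """
    candidates = [c for c in rows[0] if c not in (column, target)]
    feats = most_correlated(rows, target, candidates, m)
    donors = [r for r in rows if r[column] is not None]
    filled = [dict(r) for r in rows]  # copy so the input is left untouched
    for r in filled:
        if r[column] is None:
            point = [r[f] for f in feats]
            nearest = sorted(
                donors, key=lambda d: math.dist([d[f] for f in feats], point)
            )[:k]
            r[column] = sum(d[column] for d in nearest) / len(nearest)
    return filled

# Hypothetical toy data: the last row is missing its lot size.
rows = [
    {"area": 50.0, "rooms": 2.0, "age": 30.0, "lot": 200.0, "price": 100.0},
    {"area": 60.0, "rooms": 2.0, "age": 5.0, "lot": 220.0, "price": 120.0},
    {"area": 100.0, "rooms": 4.0, "age": 20.0, "lot": 400.0, "price": 200.0},
    {"area": 110.0, "rooms": 4.0, "age": 10.0, "lot": 420.0, "price": 220.0},
    {"area": 58.0, "rooms": 2.0, "age": 15.0, "lot": None, "price": 115.0},
]

filled = knn_mcf_impute(rows, "lot", "price", k=2, m=2)
print(filled[4]["lot"])  # 210.0
```

In this toy data, `area` and `rooms` are the two features most correlated with `price`, so the weakly correlated `age` column does not affect which neighbors are chosen. In practice the features would likely need scaling before distances are computed.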
ISSN: 2220-9964
DOI: 10.3390/ijgi9040227