Peeking inside the Black Box: Interpretable Machine Learning and Hedonic Rental Estimation

Bibliographic Details
Published in	IDEAS Working Paper Series from RePEc
Main Authors	Cajias, Marcelo; Willwersch, Jonas; Lorenz, Felix; Fuerst, Franz
Format	Paper
Language	English
Published	St. Louis: Federal Reserve Bank of St. Louis, 01.01.2021
Subjects	Machine learning
Summary:	Machine Learning (ML) can detect complex relationships to solve problems in various research areas. To estimate real estate prices and rents, ML represents a promising extension to the hedonic literature since it can increase predictive accuracy and is more flexible than the standard regression-based hedonic approach in handling a variety of quantitative and qualitative inputs. Nevertheless, its inferential capacity is limited due to its complex non-parametric structure and the ‘black box’ nature of its operations. In recent years, research on Interpretable Machine Learning (IML) has emerged that improves the interpretability of ML applications. This paper aims to elucidate the analytical behaviour of ML methods and their predictions of residential rents by applying a set of model-agnostic methods. Using a dataset of 58k apartment listings in Frankfurt am Main (Germany), we estimate rent levels with the eXtreme Gradient Boosting algorithm (XGB). We then apply Permutation Feature Importance (PFI), Partial Dependence Plots (PDP), Individual Conditional Expectation (ICE) curves and Accumulated Local Effects (ALE). Our results suggest that IML methods can provide valuable insights and yield higher interpretability of ‘black box’ models. According to the PFI results, the most relevant locational variables for apartments are proximity to bars, convenience stores and bus station hubs. Feature effects show that ML identifies non-linear relationships between rent and proximity variables. Rental prices increase up to a distance of approximately 3 kilometres from a central bus hub, followed by a steep decline. We therefore assume that tenants face a trade-off between good infrastructural accessibility and locational separation from the disamenities associated with traffic hubs, such as noise and air pollution. The same holds true for proximity to bars, with rents peaking at a distance of 1 km.
While tenants appear to appreciate nearby nightlife facilities, immediate proximity is subject to r
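
The model-agnostic methods named in the summary (PFI, ICE and PDP) can be illustrated with a minimal standard-library sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `predict` function is a hypothetical stand-in for a fitted model (not XGB), the data are synthetic, and the hump at 3 km is hard-coded only so that the partial dependence curve has a visible peak. ALE is not covered.

```python
import random


def predict(row):
    """Hypothetical stand-in for a fitted model (not XGB).

    row = (distance_km, area_m2, unused); the third feature is ignored on purpose.
    """
    distance_km, area_m2, _unused = row
    return 12.0 - 0.5 * (distance_km - 3.0) ** 2 + 0.01 * area_m2


def rmse(rows, targets, predict_fn):
    """Root mean squared error of predict_fn on (rows, targets)."""
    squared = [(predict_fn(r) - t) ** 2 for r, t in zip(rows, targets)]
    return (sum(squared) / len(squared)) ** 0.5


def replace_value(row, col, value):
    """Return a copy of row with the feature at index col set to value."""
    return row[:col] + (value,) + row[col + 1:]


def permutation_importance(rows, targets, predict_fn, col, rng):
    """PFI: increase in RMSE after shuffling one feature column."""
    baseline = rmse(rows, targets, predict_fn)
    shuffled = [r[col] for r in rows]
    rng.shuffle(shuffled)
    permuted = [replace_value(r, col, v) for r, v in zip(rows, shuffled)]
    return rmse(permuted, targets, predict_fn) - baseline


def ice_curves(rows, predict_fn, col, grid):
    """ICE: one prediction curve per observation as the feature sweeps the grid."""
    return [[predict_fn(replace_value(r, col, g)) for g in grid] for r in rows]


def partial_dependence(rows, predict_fn, col, grid):
    """PDP: pointwise average of the ICE curves."""
    curves = ice_curves(rows, predict_fn, col, grid)
    return [sum(c[i] for c in curves) / len(curves) for i in range(len(grid))]


# Synthetic data (illustrative only, not the Frankfurt listings).
rng = random.Random(0)
rows = [(rng.uniform(0, 6), rng.uniform(30, 120), rng.random()) for _ in range(200)]
targets = [predict(r) + rng.gauss(0, 0.3) for r in rows]

grid = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
pfi_distance = permutation_importance(rows, targets, predict, 0, rng)
pfi_unused = permutation_importance(rows, targets, predict, 2, rng)
ice = ice_curves(rows, predict, 0, grid)
pdp = partial_dependence(rows, predict, 0, grid)
```

In this sketch, shuffling the distance column raises the error while shuffling the ignored column leaves it unchanged, which is the contrast PFI relies on. The PDP is simply the average of the per-listing ICE curves, so a non-linear, hump-shaped effect shows up as a peak in the averaged curve.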