Enriching administrative data using survey data and machine learning techniques
Published in | Economics letters Vol. 243; p. 111924 |
---|---|
Main Author | |
Format | Journal Article |
Language | English |
Published | Elsevier B.V. 01.10.2024 |
Subjects | |
Summary: | I propose an approach to enrich administrative data with information only available in survey data using machine learning techniques. To illustrate the approach, I replicate a prominent study that used survey data to analyze the federal minimum wage introduction in Germany. In contrast to the original study, I use the universe of German establishments rather than the limited number of establishments that participated in the survey. As the administrative data do not contain information on whether establishments were treated by the minimum wage, I use a random forest classifier, trained on survey data, to predict the treatment status of establishments. The results obtained using the administrative data are qualitatively similar to the results obtained using the survey data. Beyond replication of previous research, this approach broadens the research potential of administrative data, enabling researchers to explore more detailed research questions at scale.
• I propose an approach to enrich administrative data with additional variables.
• I use machine learning and survey data to enrich the administrative data.
• The approach enables researchers to answer more detailed questions at scale.
• To illustrate the approach, I replicate a study on the effects of minimum wages. |
---|---|
ISSN: | 0165-1765 |
DOI: | 10.1016/j.econlet.2024.111924 |
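
The enrichment step described in the summary (train a random forest classifier on survey data where treatment status is observed, then predict it for establishments in the administrative data) might look roughly like the sketch below. This is an illustration only, not the author's implementation: the record does not state which covariates, hyperparameters, or software were used, so the synthetic data, the feature names (`mean_wage`, `log_size`, `east`), the helper `simulate_establishments`, and the choice of scikit-learn are all assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)


def simulate_establishments(n):
    """Synthetic stand-in for establishment data (hypothetical covariates)."""
    mean_wage = rng.normal(12.0, 3.0, n)   # assumed: mean hourly wage
    log_size = rng.normal(3.0, 1.0, n)     # assumed: log number of employees
    east = rng.integers(0, 2, n)           # assumed: region indicator
    X = np.column_stack([mean_wage, log_size, east])
    # Assumed data-generating process: low-wage establishments are more
    # likely to be affected ("treated") by the minimum wage.
    p_treated = 1.0 / (1.0 + np.exp(1.5 * (mean_wage - 10.0) - 0.5 * east))
    treated = rng.binomial(1, p_treated)
    return X, treated


# Survey sample: covariates and treatment status are both observed.
X_survey, y_survey = simulate_establishments(2000)

# Administrative data: same covariates, but treatment status is unobserved,
# so the simulated labels are discarded here.
X_admin, _ = simulate_establishments(10000)

# Hold out part of the survey to check out-of-sample prediction quality.
X_train, X_test, y_train, y_test = train_test_split(
    X_survey, y_survey, test_size=0.25, random_state=0, stratify=y_survey
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
holdout_accuracy = clf.score(X_test, y_test)

# Enrich the administrative data with the predicted treatment status.
admin_treated_hat = clf.predict(X_admin)

print(f"hold-out accuracy on survey data: {holdout_accuracy:.3f}")
print(f"predicted share treated in admin data: {admin_treated_hat.mean():.3f}")
```

In a sketch like this, the predicted indicator `admin_treated_hat` would then stand in for the survey-based treatment variable in the downstream analysis. Because the predictions contain classification error, checking hold-out accuracy on the survey sample before relying on them seems prudent.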