Enriching administrative data using survey data and machine learning techniques

Bibliographic Details
Published in: Economics Letters Vol. 243; p. 111924
Main Author: Kunaschk, Max
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.10.2024
Summary: I propose an approach to enrich administrative data with information only available in survey data using machine learning techniques. To illustrate the approach, I replicate a prominent study that used survey data to analyze the federal minimum wage introduction in Germany. In contrast to the original study, I use the universe of German establishments rather than the limited number of establishments that participated in the survey. As the administrative data do not contain information on whether establishments were treated by the minimum wage, I use a random forest classifier, trained on survey data, to predict the treatment status of establishments. The results obtained using the administrative data are qualitatively similar to the results obtained using the survey data. Beyond replication of previous research, this approach broadens the research potential of administrative data, enabling researchers to explore more detailed research questions at scale.

• I propose an approach to enrich administrative data with additional variables.
• I use machine learning and survey data to enrich the administrative data.
• The approach enables researchers to answer more detailed questions at scale.
• To illustrate the approach, I replicate a study on the effects of minimum wages.
ISSN: 0165-1765
DOI: 10.1016/j.econlet.2024.111924
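
The classification step described in the summary (train a random forest on survey establishments whose treatment status is known, then predict treatment status for administrative establishments) could be sketched roughly as follows. This is a minimal illustration and not the author's implementation. Everything in it is an assumption: the data are simulated, the two features (`mean_wage`, `low_skill_share`) and the rule that generates the treatment flag are invented, and the forest is a small dependency-free toy written only so that the example runs with the Python standard library. In practice one would more likely use an established implementation such as scikit-learn's `RandomForestClassifier`.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def majority(labels):
    """Most frequent 0/1 label; ties resolve to 1."""
    return int(2 * sum(labels) >= len(labels))


def build_tree(rows, labels, rng, depth=0, max_depth=3):
    """Grow one tree; every split considers a single randomly drawn feature."""
    if depth == max_depth or len(set(labels)) == 1:
        return majority(labels)
    feature = rng.randrange(len(rows[0]))
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        if not left or not right:
            continue
        score = len(left) * gini(left) + len(right) * gini(right)
        if best is None or score < best[0]:
            best = (score, threshold)
    if best is None:  # no valid split on the drawn feature
        return majority(labels)
    threshold = best[1]
    go_left = [row[feature] <= threshold for row in rows]
    children = []
    for side in (True, False):
        sub_rows = [r for r, g in zip(rows, go_left) if g == side]
        sub_labels = [y for y, g in zip(labels, go_left) if g == side]
        children.append(build_tree(sub_rows, sub_labels, rng, depth + 1, max_depth))
    return (feature, threshold, children[0], children[1])


def predict_tree(tree, row):
    """Walk from the root to a leaf; leaves are plain 0/1 integers."""
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = left if row[feature] <= threshold else right
    return tree


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Fit each tree on a bootstrap sample of the training rows."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx], rng))
    return forest


def predict_forest(forest, row):
    """Majority vote over all trees."""
    return majority([predict_tree(tree, row) for tree in forest])


def simulate(n, rng):
    """Simulated establishments: (mean_wage, low_skill_share) plus an invented treatment flag."""
    rows, labels = [], []
    for _ in range(n):
        mean_wage = rng.uniform(6.0, 20.0)
        low_skill_share = rng.uniform(0.0, 1.0)
        rows.append((mean_wage, low_skill_share))
        labels.append(int(mean_wage - 2.0 * low_skill_share < 9.0))  # made-up rule
    return rows, labels


data_rng = random.Random(42)
survey_rows, survey_treated = simulate(300, data_rng)  # treatment status observed
admin_rows, _ = simulate(500, data_rng)                # treatment status unobserved

# Hold out part of the survey sample to check the classifier before using it.
train_rows, train_y = survey_rows[:200], survey_treated[:200]
valid_rows, valid_y = survey_rows[200:], survey_treated[200:]

forest = fit_forest(train_rows, train_y)
valid_pred = [predict_forest(forest, row) for row in valid_rows]
accuracy = sum(p == y for p, y in zip(valid_pred, valid_y)) / len(valid_y)

# Enrich the administrative records with a predicted treatment indicator.
predicted = [predict_forest(forest, row) for row in admin_rows]
print(f"hold-out accuracy: {accuracy:.2f}")
print(f"predicted treated share: {sum(predicted) / len(predicted):.2f}")
```

The sketch relies on the features used for prediction being available in both data sources, and it checks the classifier on a held-out part of the survey sample before applying it to the administrative records, since predicted treatment status cannot be verified there directly.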