INTEGRATING MULTIPLE BUILT ENVIRONMENT DATA SOURCES
Published in | The annals of applied statistics Vol. 17; no. 2; p. 1722 |
---|---|
Format | Journal Article |
Language | English |
Published | United States, 01.06.2023 |
Summary: | Studies examining the contribution of the built environment to health often rely on commercial data sources to derive exposure measures such as the number of specific food outlets in study participants' neighborhoods. Data on the location of community amenities (e.g., food outlets) can be collected from multiple sources. However, these commercial listings are known to have ascertainment errors and thus provide conflicting claims about the number and location of amenities. We propose a method that integrates exposure measures from different databases while accounting for ascertainment errors and obtains unbiased health effects of latent exposure. We frame the problem of conflicting exposure measures as a problem of two contingency tables with partially known margins, with the entries of the tables modeled using a multinomial distribution. Available estimates of source quality were embedded in a joint model for observed exposure counts, latent exposures, and health outcomes. Simulations show that our modeling framework yields substantially improved inferences regarding the health effects. We used the proposed method to estimate the association between children's body mass index (BMI) and the concentration of food outlets near their schools when both the NETS and Reference USA databases are available. |
---|---|
ISSN: | 1932-6157 |
DOI: | 10.1214/22-aoas1692 |