Analysis of multiple-variable missing-not-at-random survey data for child lead surveillance using NHANES

Bibliographic Details
Published in Statistics in Medicine Vol. 35; no. 29; pp. 5417-5429
Main Authors Roberts, Eric M., English, Paul B.
Format Journal Article
Language English
Published England Blackwell Publishing Ltd 20.12.2016
Wiley Subscription Services, Inc
Summary:
Background: Although ongoing, multi-topic surveys form the basis of public health surveillance in many countries, their utility for specific subject matter areas can be limited by high proportions of missing data. For example, the National Health and Nutrition Examination Survey (NHANES) is the main resource for surveillance of elevated blood lead levels (EBLLs) in US children, but key predictor variables are missing for as many as 35% of respondents.
Methods: Using a Bayesian framework, we formulate a t-distributed Heckman selection model applicable to the case of multiple missing-not-at-random variables in the context of a complex survey design. We demonstrate the utility of the results by calculating prevalence estimates for lead levels exceeding 2.5, 5.0, and 10.0 µg/dL among children 1 to 5 years of age for a variety of time points and geographies, applying the coefficients to data from the American Community Survey from the US Census.
Results: We present a protocol for estimating posterior distributions of parameters using Gibbs and grid sampling steps. Stark disparities in the prevalence of EBLL by race/ethnicity, age of housing, and poverty are readily quantified, and three- to five-fold differences in predicted prevalence across geographies within the US are presented.
Conclusions: We are able to conduct multivariate analyses of EBLLs that incorporate the crucial variable age of housing, analyses that have not previously been available using these data. This represents an expansion of the utility of NHANES that is likely to be relevant to many similar ongoing, multi-topic health surveillance efforts.
Copyright © 2016 John Wiley & Sons, Ltd.
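The summary refers to a t-distributed Heckman selection model for missing-not-at-random data. The following is a minimal simulation sketch of that general idea, not the authors' model: it uses a single outcome, no covariates, no survey weights, and no Bayesian estimation. All names and numeric values (`simulate_selection`, `rho`, `nu`, the intercepts 1.0 and 0.3) are illustrative assumptions. It shows why a naive complete-case mean can be biased when the errors of the outcome and selection equations are correlated.

```python
import math
import random


def simulate_selection(n=20000, rho=0.6, nu=5.0, seed=0):
    """Draw (outcome, observed-flag) pairs from a toy t-distributed selection model.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    ys, observed = [], []
    for _ in range(n):
        # Correlated standard normal errors for the outcome and selection equations.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # A shared chi-square mixing variable turns the normal pair into a bivariate t
        # with nu degrees of freedom (chi-square(nu) = Gamma(shape=nu/2, scale=2)).
        w = math.sqrt(rng.gammavariate(nu / 2.0, 2.0) / nu)
        e_outcome, e_select = z1 / w, z2 / w
        y = 1.0 + e_outcome   # latent outcome, e.g. a log-scale measurement
        s = 0.3 + e_select    # latent propensity to be observed
        ys.append(y)
        observed.append(s > 0.0)  # the outcome is recorded only when s > 0
    return ys, observed


ys, observed = simulate_selection()
full_mean = sum(ys) / len(ys)                      # mean if nothing were missing
seen = [y for y, flag in zip(ys, observed) if flag]
naive_mean = sum(seen) / len(seen)                 # complete-case mean
print(f"observed fraction: {len(seen) / len(ys):.2f}")
print(f"full-data mean:    {full_mean:.2f}")
print(f"complete-case mean: {naive_mean:.2f}")
```

Because the selection error is positively correlated with the outcome error in this toy setup, the complete-case mean overstates the full-data mean. A selection model of the kind named in the summary aims to correct for this by modeling the outcome and the missingness mechanism jointly.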
Bibliography: Supporting info item
istex: 97EFD6279CF1F62D659C348249A586A338B33E1F
ArticleID: SIM7067
ark:/67375/WNG-69040KSX-S
ISSN: 0277-6715, 1097-0258
DOI: 10.1002/sim.7067