Semiparametric count data regression for self‐reported mental health


Bibliographic Details
Published in Biometrics Vol. 79; no. 2; pp. 1520-1533
Main Authors Kowal, Daniel R., Wu, Bohan
Format Journal Article
Language English
Published United States Blackwell Publishing Ltd 01.06.2023
Summary: “For how many days during the past 30 days was your mental health not good?” The responses to this question measure self‐reported mental health and can be linked to important covariates in the National Health and Nutrition Examination Survey (NHANES). However, these count variables present major distributional challenges: the data are overdispersed, zero‐inflated, bounded by 30, and heaped in 5‐ and 7‐day increments. To address these challenges, which are especially common for health questionnaire data, we design a semiparametric estimation and inference framework for count data regression. The data‐generating process is defined by simultaneously transforming and rounding (STAR) a latent Gaussian regression model. The transformation is estimated nonparametrically, and the rounding operator ensures the correct support for the discrete and bounded data. Maximum likelihood estimators are computed using an expectation‐maximization (EM) algorithm that is compatible with any continuous data model estimable by least squares. STAR regression includes asymptotic hypothesis testing and confidence intervals, variable selection via information criteria, and customized diagnostics. Simulation studies validate the utility of this framework. Using STAR regression, we identify key factors associated with self‐reported mental health and demonstrate substantial improvements in goodness‐of‐fit compared to existing count data regression models.
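As a rough illustration of the data-generating process described in the summary, the following Python sketch simulates bounded counts by rounding a transformed latent Gaussian regression. It is not the authors' implementation: the log transformation, the floor-based rounding rule, and the names `simulate_star`, `beta`, `sigma`, and `y_max` are assumptions chosen for illustration, whereas the article estimates the transformation nonparametrically.

```python
import numpy as np


def simulate_star(X, beta, sigma, y_max=30, seed=None):
    """Simulate counts from a transform-and-round latent Gaussian model.

    Illustrative assumptions (not taken from the article):
      * the transformation is the log, so its inverse is exp;
      * rounding is the floor function, truncated to [0, y_max].
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Latent Gaussian regression: z = X beta + sigma * noise
    z = X @ beta + sigma * rng.standard_normal(n)
    # Undo the assumed log transformation to get a continuous latent value
    y_star = np.exp(z)
    # Round down to integers; latent values below 1 (z < 0) map to zero
    y = np.floor(y_star)
    # Enforce the bounded support {0, 1, ..., y_max}
    return np.clip(y, 0, y_max).astype(int)


# Hypothetical usage: an intercept plus one standard normal covariate
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(500), rng.standard_normal(500)])
y = simulate_star(X, beta=np.array([1.0, 0.5]), sigma=1.5, seed=1)
print("share of zeros:", np.mean(y == 0))
print("share at upper bound:", np.mean(y == 30))
```

Under these assumptions, every latent value below zero collapses to a count of zero and every large latent value collapses to the upper bound, which suggests how a rounding operator can produce both zero inflation and the correct bounded support. The sketch does not attempt to reproduce heaping at 5‐ and 7‐day increments.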
ISSN: 0006-341X, 1541-0420
DOI: 10.1111/biom.13617