Analysis of partially observed clustered data using generalized estimating equations and multiple imputation

Bibliographic Details
Published in Stata Journal Vol. 14; no. 4; pp. 863-883
Main Authors Aloisio, Kathryn M, Micali, Nadia, Swanson, Sonja A, Field, Alison, Horton, Nicholas J
Format Journal Article
Language English
Published 2014
Edition 199
ISSN 1536-8634
DOI 10.22004/ag.econ.267105

Abstract Clustered data arise in many settings, particularly within the social and biomedical sciences. For example, multiple-source reports are commonly collected in child and adolescent psychiatric epidemiologic studies where researchers use various informants (for instance, parents and adolescents) to provide a holistic view of a subject’s symptoms. Fitzmaurice et al. (1995, American Journal of Epidemiology 142: 1194–1203) have described estimation of multiple-source models using a standard generalized estimating equation (GEE) framework. However, these studies often have missing data because additional stages of consent and assent are required. The usual GEE is unbiased when data are missing completely at random in the sense of Little and Rubin (2002, Statistical Analysis with Missing Data [Wiley]). This is a strong assumption that may not be tenable. Other options, such as the weighted GEE, are computationally challenging when missingness is nonmonotone. Multiple imputation is an attractive method to fit incomplete data models while requiring only the less restrictive missing-at-random assumption. Previously, estimation with partially observed clustered data was computationally challenging. However, recent developments in Stata have made these methods easier to use in practice. We demonstrate how to use multiple imputation in conjunction with a GEE to investigate the prevalence of eating disorder symptoms in adolescents as reported by parents and adolescents and to determine the factors associated with concordance and prevalence. The methods are motivated by the Avon Longitudinal Study of Parents and their Children, a cohort study that enrolled more than 14,000 pregnant mothers in 1991–92 and has followed the health and development of their children at regular intervals. While point estimates for the missing-at-random model were fairly similar to those for the GEE under missing completely at random, the missing-at-random model had smaller standard errors and required less stringent assumptions regarding missingness.
Online Access https://ageconsearch.umn.edu/record/267105/
Subjects ALSPAC study; eating disorders; generalized estimating equations; missing at random; missing completely at random; missing data; multiple imputation; multiple informants; Research Methods/Statistical Methods; weighted estimating equations