Combining information from two data sources with misreporting and incompleteness to assess hospice-use among cancer patients: a multiple imputation approach

Bibliographic Details
Published in Statistics in medicine Vol. 33; no. 21; pp. 3710 - 3724
Main Authors He, Yulei; Landrum, Mary Beth; Zaslavsky, Alan M.
Format Journal Article
Language English
Published England Blackwell Publishing Ltd 20.09.2014
Wiley Subscription Services, Inc
Summary: Combining information from multiple data sources can enhance estimates of health‐related measures by using one source to supply information that is lacking in another, assuming the former has accurate and complete data. However, little research has been conducted on combining methods when each source might be imperfect, for example, subject to measurement errors and/or missing data. In a multisite study of hospice‐use by late‐stage cancer patients, this variable was available from patients’ abstracted medical records, in which it may be considerably underreported because of incomplete acquisition of these records. Therefore, data for Medicare‐eligible patients were supplemented with their Medicare claims, which contained information on hospice‐use and may also be subject to underreporting, yet to a lesser degree. In addition, both sources suffered from missing data because of unit nonresponse from medical record abstraction and sample undercoverage for Medicare claims. We treat the true hospice‐use status of these patients as a latent variable and propose to multiply impute it using information from both data sources, borrowing strength from each. We characterize the complete‐data model as a product of an ‘outcome’ model for the probability of hospice‐use and a ‘reporting’ model for the probability of underreporting from both sources, adjusting for other covariates. Assuming that the reports of hospice‐use from both sources are missing at random and that underreporting in the two sources is conditionally independent, we develop a Bayesian multiple imputation algorithm and conduct multiple imputation analyses of patient hospice‐use in demographic and clinical subgroups. The proposed approach yields more sensible results than alternative methods in our example. Our model is also related to dual system estimation in population censuses and dual exposure assessment in epidemiology. Copyright © 2014 John Wiley & Sons, Ltd.
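The idea in the summary can be illustrated with a deliberately simplified sketch. It is not the authors' algorithm: it drops all covariates, assumes neither source ever reports hospice use for a true non-user (underreporting only), uses uniform Beta(1, 1) priors, and simulates reports that are missing completely at random (a special case of missing at random). The function names `simulate`, `gibbs_impute`, and `pool_prevalence` and all parameter values are hypothetical. A Gibbs sampler alternates between imputing the latent true status and drawing the prevalence and the two reporting probabilities, and the imputed data sets are then pooled with Rubin's combining rules.

```python
import random
import statistics


def simulate(n=2000, p=0.4, sens=(0.6, 0.85), p_missing=(0.2, 0.3), seed=1):
    """Simulate two underreported, partly missing reports of a binary status."""
    rng = random.Random(seed)
    reports, truth = [], []
    for _ in range(n):
        z = 1 if rng.random() < p else 0
        ys = []
        for s, pm in zip(sens, p_missing):
            if rng.random() < pm:
                ys.append(None)  # report missing for this source
            else:
                ys.append(1 if z == 1 and rng.random() < s else 0)
        reports.append(tuple(ys))
        truth.append(z)
    return reports, truth


def gibbs_impute(reports, n_imputations=5, burn_in=200, thin=20, seed=0):
    """Return imputed copies of the latent true status (lists of 0/1)."""
    rng = random.Random(seed)
    n = len(reports)
    p, sens = 0.5, [0.5, 0.5]  # starting values (assumption)
    imputations = []
    for it in range(1, burn_in + thin * n_imputations + 1):
        # Step 1: impute the latent status given the current parameters.
        z = []
        for ys in reports:
            if any(y == 1 for y in ys):
                z.append(1)  # a positive report implies true use here
                continue
            # Probability that a true user has no positive observed report.
            miss = 1.0
            for y, s in zip(ys, sens):
                if y == 0:
                    miss *= 1.0 - s
            prob = p * miss / (p * miss + (1.0 - p))
            z.append(1 if rng.random() < prob else 0)
        # Step 2: draw parameters from their Beta full conditionals.
        n_true = sum(z)
        p = rng.betavariate(1 + n_true, 1 + n - n_true)
        for j in range(2):
            hits = sum(1 for zi, ys in zip(z, reports) if zi == 1 and ys[j] == 1)
            misses = sum(1 for zi, ys in zip(z, reports) if zi == 1 and ys[j] == 0)
            sens[j] = rng.betavariate(1 + hits, 1 + misses)
        # Keep thinned draws after burn-in as the multiple imputations.
        if it > burn_in and (it - burn_in) % thin == 0:
            imputations.append(z)
    return imputations


def pool_prevalence(imputations):
    """Pool the imputed prevalence estimates with Rubin's combining rules."""
    m = len(imputations)
    n = len(imputations[0])
    estimates = [sum(z) / n for z in imputations]
    within = statistics.mean([q * (1 - q) / n for q in estimates])
    between = statistics.variance(estimates) if m > 1 else 0.0
    return statistics.mean(estimates), within + (1 + 1 / m) * between


if __name__ == "__main__":
    reports, truth = simulate()
    estimate, total_var = pool_prevalence(gibbs_impute(reports))
    print("pooled prevalence:", estimate, "total variance:", total_var)
```

Because a positive report from either source fixes the imputed status at 1 in this sketch, the pooled estimate can never fall below the naive "either source positive" rate. The covariate-adjusted outcome and reporting models described in the summary would replace the three scalar parameters used here.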
Bibliography:ark:/67375/WNG-73BSMCJC-4
istex:A7564A640ABD3C860E230D4F58556BA83863B4A6
ArticleID:SIM6173
wdq7@cdc.gov
ISSN:0277-6715
1097-0258
DOI:10.1002/sim.6173