Combining information from two data sources with misreporting and incompleteness to assess hospice-use among cancer patients: a multiple imputation approach
Published in | Statistics in medicine Vol. 33; no. 21; pp. 3710 - 3724 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | England: Blackwell Publishing Ltd, 20.09.2014; Wiley Subscription Services, Inc |
Subjects | |
Summary: | Combining information from multiple data sources can enhance estimates of health‐related measures by using one source to supply information that is lacking in another, assuming the former has accurate and complete data. However, little research has been conducted on combining methods when each source might be imperfect, for example, subject to measurement errors and/or missing data. In a multisite study of hospice‐use by late‐stage cancer patients, this variable was available from patients’ abstracted medical records, where it may be considerably underreported because of incomplete acquisition of these records. Therefore, data for Medicare‐eligible patients were supplemented with their Medicare claims, which contained information on hospice‐use and may also be subject to underreporting, yet to a lesser degree. In addition, both sources suffered from missing data because of unit nonresponse from medical record abstraction and sample undercoverage for Medicare claims. We treat the true hospice‐use status of these patients as a latent variable and propose to multiply impute it using information from both data sources, borrowing strength from each. We characterize the complete‐data model as a product of an ‘outcome’ model for the probability of hospice‐use and a ‘reporting’ model for the probability of underreporting from both sources, adjusting for other covariates. Assuming that the reports of hospice‐use from both sources are missing at random and that underreporting in the two sources is conditionally independent, we develop a Bayesian multiple imputation algorithm and conduct multiple imputation analyses of patient hospice‐use in demographic and clinical subgroups. The proposed approach yields more sensible results than alternative methods in our example. Our model is also related to dual system estimation in population censuses and dual exposure assessment in epidemiology. Copyright © 2014 John Wiley & Sons, Ltd. |
---|---|
Bibliography: | ark:/67375/WNG-73BSMCJC-4; istex:A7564A640ABD3C860E230D4F58556BA83863B4A6; ArticleID:SIM6173; wdq7@cdc.gov |
ISSN: | 0277-6715 1097-0258 |
DOI: | 10.1002/sim.6173 |
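
The summary describes the complete‐data model as a product of an 'outcome' model and a 'reporting' model, with the true hospice‐use status treated as a latent variable. The sketch below illustrates that factorization for a single patient and is not the authors' algorithm. It assumes fixed, known parameters (the paper's Bayesian procedure would draw them from a posterior), and it assumes that sources can miss a true use but never report a false one. The function names `posterior_true_use` and `impute`, and all numeric values, are hypothetical.

```python
import random


def posterior_true_use(p_use, sens, reports):
    """P(true hospice use = 1 | observed reports) for one patient.

    p_use   : probability of hospice use from the 'outcome' model (assumed known)
    sens    : per-source probability that a true use gets reported
              ('reporting' model, assumed known and strictly below 1)
    reports : per-source report: 1, 0, or None when the source is missing
    """
    # Assumption: underreporting only, so any positive report is decisive.
    if any(r == 1 for r in reports):
        return 1.0
    # Conditional independence: multiply the miss probabilities of the
    # sources that reported 0. Missing sources are skipped, which is the
    # effect of assuming the reports are missing at random.
    miss_given_use = 1.0
    for s, r in zip(sens, reports):
        if r is not None:
            miss_given_use *= 1.0 - s
    numerator = p_use * miss_given_use
    # Without true use, every observed report is 0 with probability 1.
    return numerator / (numerator + (1.0 - p_use))


def impute(p_use, sens, reports, m=5, seed=0):
    """Draw m imputations of the latent true status from its posterior."""
    rng = random.Random(seed)
    post = posterior_true_use(p_use, sens, reports)
    return [1 if rng.random() < post else 0 for _ in range(m)]


if __name__ == "__main__":
    # Hypothetical values: medical records catch 60% of true uses, claims 90%.
    sens = (0.6, 0.9)
    print(posterior_true_use(0.5, sens, (0, 0)))     # both sources report no use
    print(posterior_true_use(0.5, sens, (0, None)))  # claims missing
    print(impute(0.5, sens, (0, None)))
```

When both sources are missing, the posterior reduces to the outcome‐model probability, so the imputation then relies entirely on covariates. Analyses would be run on each of the `m` completed data sets and combined with the usual multiple imputation rules.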