In this dissertation, we address the problem of simultaneously imputing missing and censored values in multivariate data. The research is motivated by the need to analyze data from the National Health and Nutrition Examination Survey (NHANES), which are subject to both missing values and limits of detection. We propose two imputation frameworks to address these issues. The first framework imputes missing and censored values under a multivariate normality assumption on the full data. The second framework imputes missing and censored values by modeling the outcome variable and the covariates separately: the model for the outcome given the covariates can be specified flexibly, while the covariates are assumed to be normally distributed. A particular focus of the second framework is the problem of missing and censored covariates in the Cox regression model, where our proposed approach imputes the censored survival outcomes and the missing and/or censored covariates simultaneously. All proposed approaches rely on two assumptions. The first is the missing at random (MAR) assumption, under which missingness depends only on the observed values and is unrelated to the unobserved values. The second is the non-informative censoring assumption, under which the censoring times and the failure times are statistically independent given the observed covariate values. The performance of the proposed approaches is compared through simulation studies with that of popular existing multiple imputation (MI) approaches, such as MICE (Multiple Imputation by Chained Equations). The simulation studies suggest that the proposed approaches perform better in terms of correcting possible bias in the estimator and gaining estimation efficiency. The proposed approaches are applied to two real data examples.
Advisor
Chen, Hua Yun
Chair
Chen, Hua Yun
Department
Public Health Sciences-Biostatistics
Degree Grantor
University of Illinois at Chicago
Degree Level
Doctoral
Degree Name
PhD, Doctor of Philosophy
Committee Member
Turyk, Mary
Basu, Sanjib
Demirtas, Hakan
Huang, Xin