University of Illinois Chicago

Missing Data Analysis with Mixture Missing Mechanisms and Robust Inferences

thesis
posted on 2022-05-01, 00:00 authored by Yu-Che Chung
This dissertation aims to provide flexible missing data models and robust methods for handling missing data. Two methods are proposed, developed and studied. First, a mixture missing data mechanism model is developed in which flexible missing data assumptions are made at the individual level. The proposed model can greatly reduce bias when the proportions of missing mechanisms are correctly specified, and it demonstrates strong performance even when those proportions are misspecified. Extensive simulation studies are conducted to assess the proposed method. We examine the method for cross-sectional data as well as longitudinal data. We also consider non-parametric regression for potentially nonlinear relationships between the outcome and covariates. We find that data with mixture missing mechanisms can yield unstable estimates when analyzed under a missing not at random (MNAR) assumption, whereas stable, low-bias estimates are obtained when an appropriate mixture of mechanisms is assumed. Other scenarios, including under-fitting, over-fitting and the intensity of missingness in the data-generating model, are also assessed in the simulations.

Second, a missing data analysis often utilizes an outcome model and a missingness model. The well-known double robustness property provides asymptotic protection against misspecification of one of these two models. Little and An (2004), Zhang and Little (2009) and others proposed the penalized spline of propensity prediction (PSPP) method, which provides double robustness for predicting marginal and conditional means under missing at random (MAR). We develop the multiple penalized spline of propensity prediction (mPSPP) method, which incorporates multiple propensity score models. We establish that mPSPP is robust against propensity score misspecification. We also develop a stratified mPSPP that provides multiple robustness for marginal and conditional means. This approach uses regularization when a large number of propensity score models are included in the prediction model. We assess the performance of the proposed mPSPP methodology in both non-regularized and regularized settings in extensive numerical studies. The PArTNER study is presented as a real data example, and both proposed methods are applied to it to show their applicability. Other standard missing data methods are also included and compared in the real data example.
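
For readers unfamiliar with PSPP, the following is a minimal Python sketch of a PSPP-style estimate of a marginal mean under MAR, using a single propensity model. It is not the dissertation's mPSPP implementation. The function names (`fit_logistic`, `spline_basis`, `pspp_mean`), the truncated linear basis, the quantile knot placement and the fixed ridge penalty are all illustrative assumptions. As usually described, PSPP estimates the smoothing parameter through a mixed-model formulation and may also include other covariates parametrically, which this sketch omits.

```python
import numpy as np

def fit_logistic(X, r, n_iter=25, ridge=1e-8):
    """Logistic regression by Newton-Raphson; returns [intercept, slopes...]."""
    Z = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Z @ beta)))
        w = p * (1.0 - p)
        grad = Z.T @ (r - p)
        # Tiny ridge term only for numerical stability (an assumption).
        hess = (Z * w[:, None]).T @ Z + ridge * np.eye(Z.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def spline_basis(v, knots):
    """Truncated linear basis: 1, v, and (v - k)_+ for each knot k."""
    cols = [np.ones_like(v), v] + [np.maximum(v - k, 0.0) for k in knots]
    return np.column_stack(cols)

def pspp_mean(X, y, observed, n_knots=10, penalty=1.0):
    """PSPP-style marginal mean of y (hypothetical helper, single propensity model).

    y may hold NaN wherever `observed` is False.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    observed = np.asarray(observed, dtype=bool)

    # Step 1: model the probability of being observed; keep its logit.
    beta = fit_logistic(X, observed.astype(float))
    logit = np.column_stack([np.ones(len(X)), X]) @ beta

    # Step 2: penalized spline of y on the logit, fitted to respondents only.
    # Knots sit at interior quantiles of the respondents' logits.
    probs = np.linspace(0.0, 1.0, n_knots + 2)[1:-1]
    knots = np.quantile(logit[observed], probs)
    B = spline_basis(logit, knots)
    # Penalize only the knot coefficients, not the intercept or slope.
    D = np.diag([0.0, 0.0] + [penalty] * n_knots)
    Bo = B[observed]
    coef = np.linalg.solve(Bo.T @ Bo + D, Bo.T @ y[observed])

    # Step 3: predict the missing outcomes and average the completed vector.
    y_completed = np.where(observed, y, B @ coef)
    return y_completed.mean()
```

Calling `pspp_mean(x, y, observed)` with `y` set to NaN wherever `observed` is false returns the mean of the completed outcome vector. A sketch closer to mPSPP would add spline terms for several candidate propensity models, which is where the regularization described above would come in.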

History

Advisor

Basu, Sanjib

Chair

Basu, Sanjib

Department

Public Health Sciences-Biostatistics

Degree Grantor

University of Illinois at Chicago

Degree Level

  • Doctoral

Degree name

PhD, Doctor of Philosophy

Committee Member

Berbaum, Michael; Bhaumik, Dulal; Demirtas, Hakan; Krishnan, Jerry

Submitted date

May 2022

Thesis type

application/pdf

Language

  • en
