University of Illinois at Chicago

Addressing Interference and Selection Bias in Causal Inference from Networks

thesis
posted on 2024-05-01, 00:00 authored by Zahra Fatemi
Dealing with interference, the problem of treatment "spilling over" from a treated node to a control node, is central to many causal inference studies. Prominent methods for network experiment design rely on two-stage randomization, in which sparsely connected clusters are identified and cluster randomization dictates the assignment of nodes to treatment and control. However, cluster-based randomization methods face several challenges that can introduce bias into the estimated causal effects. First, it is difficult to separate nodes into treatment and control without leaving many edges, with different spillover probabilities, across clusters. Second, cluster randomization often does not ensure sufficient node randomization, which can lead to selection bias in which treatment and control nodes represent different populations of users. Third, cluster-based randomization approaches perform poorly when interference propagates in cascades, whereby the response of individuals to treatment propagates to their multi-hop neighbors. Fourth, it is hard to isolate non-interacting parts of a social network for treatment and control in order to measure the direct treatment effect alone.
While Randomized Controlled Trials (RCTs) are the gold standard for inferring causal effects, they are often infeasible due to cost or ethical concerns. Meanwhile, the abundance of observational data makes it an appealing source for estimating the causal effects of interest. However, identification and estimation of causal effects in observational network data are also challenging due to the presence of unmeasured confounders, that is, variables that affect both the treatment and the outcome of interest. In networks, the causal effect of peers' behavior on an individual's outcome, commonly referred to as the contagion effect, can be confounded by latent homophily, making it challenging to estimate from observational data.
The primary objective of this thesis is to address the challenges of causal effect estimation from network data. First, I propose a principled framework for network experiment design that uses weighted graph clustering and cluster matching to jointly minimize interference and selection bias. Then, I introduce a novel cascade-based approach that initiates treatment assignment at the cascade seed node and propagates the assignment to its multi-hop neighbors to limit interference during cascade growth. Additionally, I present a network experiment design that leverages independent sets, assigning treatment and control only to a set of non-adjacent nodes in a graph, to disentangle peer effects from the estimation of direct treatment effects. To identify and quantify contagion effects in observational network data with high-dimensional proxies, I develop a framework that integrates variational autoencoders with adversarial networks to create low-dimensional balanced representations of high-dimensional proxy variables for treatment and control nodes.
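The independent-set idea mentioned above can be illustrated with a minimal sketch. This is not the thesis's algorithm: it assumes a simple randomized greedy procedure for finding a maximal independent set, an even random split into treatment and control, and a toy path graph. The names `greedy_independent_set` and `assign_treatment` are hypothetical.

```python
import random

def greedy_independent_set(adj, rng):
    """Return a maximal independent set of a graph given as {node: set(neighbors)}.

    Assumption: a randomized greedy pass, used here only for illustration.
    """
    order = list(adj)
    rng.shuffle(order)
    chosen, blocked = set(), set()
    for node in order:
        if node not in blocked:
            chosen.add(node)           # node has no chosen neighbor so far
            blocked.add(node)
            blocked.update(adj[node])  # its neighbors can no longer be chosen
    return chosen

def assign_treatment(adj, seed=0):
    """Randomly split an independent set into treatment and control groups."""
    rng = random.Random(seed)
    independent = sorted(greedy_independent_set(adj, rng))
    rng.shuffle(independent)
    half = len(independent) // 2
    return set(independent[:half]), set(independent[half:])

# Toy graph (hypothetical): the path 0-1-2-3-4-5.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
treated, control = assign_treatment(adj)
```

In this sketch no treated node is adjacent to any control node or to another treated node, so direct one-hop spillover between experimental units is ruled out by construction. Units may still share neighbors, and nodes outside the independent set receive no assignment here.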

History

Advisor

Elena Zheleva

Department

Computer Science

Degree Grantor

University of Illinois Chicago

Degree Level

  • Doctoral

Degree name

Doctor of Philosophy

Committee Member

Barbara Di Eugenio, Cornelia Caragea, Abolfazl Asudeh, Martin Saveski

Thesis type

application/pdf

Language

  • en
