posted on 2024-05-01, 00:00 authored by Zahra Fatemi
Dealing with interference, the problem of treatment "spilling over" from a treated node to a control node, is central to many causal inference studies.
Prominent methods for network experiment design rely on two-stage randomization, in which sparsely connected clusters are identified, and cluster randomization dictates the node assignment to treatment and control. However, cluster-based randomization methods face several challenges that can introduce bias into the estimated causal effects. First, it is difficult to separate nodes into treatment and control without leaving many edges with different spillover probabilities across clusters. Second, cluster randomization often does not ensure sufficient node randomization, which can lead to selection bias where treatment and control nodes represent different populations of users.
Third, cluster-based randomization approaches perform poorly when interference propagates in cascades, whereby the response of individuals to treatment propagates to their multi-hop neighbors. Fourth, it is hard to isolate parts of a social network for treatment and control without any interactions to measure the direct treatment effect alone.
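The two-stage randomization described above can be illustrated with a minimal Python sketch. It assumes the clusters have already been found by some graph clustering step, and the function names `cluster_randomize` and `cross_arm_edges`, as well as the toy graph, are hypothetical and not taken from the thesis. The second helper lists the edges that join a treated node to a control node, which are the edges along which spillover can occur.

```python
import random


def cluster_randomize(clusters, seed=0):
    """Assign whole clusters to treatment (1) or control (0).

    `clusters` maps node -> cluster id (assumed to be precomputed).
    Every node inherits the arm of its cluster.
    """
    rng = random.Random(seed)
    cluster_ids = sorted(set(clusters.values()))
    rng.shuffle(cluster_ids)
    # The first half of the shuffled clusters goes to treatment.
    treated = set(cluster_ids[: len(cluster_ids) // 2])
    return {node: int(cid in treated) for node, cid in clusters.items()}


def cross_arm_edges(edges, assignment):
    """Return edges whose endpoints sit in different arms."""
    return [(u, v) for u, v in edges if assignment[u] != assignment[v]]


# Toy graph (an assumption for illustration): two triangles joined by one edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
clusters = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}

assignment = cluster_randomize(clusters)
leaky = cross_arm_edges(edges, assignment)
print(assignment)
print("edges that can carry spillover:", leaky)
```

Even in this small example, the bridge edge between the two clusters links a treated node to a control node, which is the first challenge listed above.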
While Randomized Controlled Trials (RCTs) are the gold standard for inferring causal effects, it is often infeasible to perform RCTs due to cost or ethical concerns. Meanwhile, the abundance of observational data makes it an appealing source for estimating the causal effects of interest. However, identification and estimation of causal effects in observational network data are also challenging due to the presence of unmeasured confounders, variables that affect both the treatment and the outcome of interest. In networks, the causal effect of peers' behavior on an individual's outcome, commonly referred to as contagion effects, can be confounded by latent homophily, making it challenging to estimate from observational data.
The primary objective of this thesis is to address the challenges of causal effect estimation from network data. First, I propose a principled framework for network experiment design that utilizes weighted graph clustering and cluster matching approaches to jointly minimize interference and selection bias. Then, I introduce a novel cascade-based approach that initiates treatment assignment from the cascade seed node and propagates the assignment to its multi-hop neighbors to limit interference during cascade growth. Additionally, I present a network experiment design that leverages independent sets and assigns treatment and control only to a set of non-adjacent nodes in a graph to disentangle peer effects from the estimation of direct treatment effects. To identify and quantify contagion effects in observational network data with high-dimensional proxies, I develop a framework that integrates variational autoencoders with adversarial networks to create low-dimensional balanced representations of high-dimensional proxy variables for treatment and control nodes.
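The independent-set idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a simple greedy heuristic to find a maximal independent set and then splits that set at random, which may differ from the procedure developed in the thesis. The function name `independent_set_design` and the path graph are hypothetical.

```python
import random


def independent_set_design(adjacency, seed=0):
    """Pick pairwise non-adjacent nodes and split them into two arms.

    `adjacency` maps node -> set of neighbours (undirected graph).
    Returns node -> 1 (treatment) or 0 (control) for selected nodes only;
    all other nodes are left out of the experiment.
    """
    rng = random.Random(seed)
    order = sorted(adjacency)
    rng.shuffle(order)

    selected, blocked = [], set()
    for node in order:
        if node in blocked:
            continue
        # Greedy step: take the node, then rule out all of its neighbours.
        selected.append(node)
        blocked.add(node)
        blocked.update(adjacency[node])

    rng.shuffle(selected)
    half = len(selected) // 2
    return {node: int(i < half) for i, node in enumerate(selected)}


# Toy path graph 0-1-2-3-4 (an assumption for illustration).
adjacency = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
design = independent_set_design(adjacency)
print(design)
```

Because no two selected nodes share an edge, no experimental unit is directly adjacent to another unit in either arm.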
History
Advisor
Elena Zheleva
Department
Computer Science
Degree Grantor
University of Illinois Chicago
Degree Level
Doctoral
Degree name
Doctor of Philosophy
Committee Member
Barbara Di Eugenio
Cornelia Caragea
Abolfazl Asudeh
Martin Saveski