posted on 2020-08-01, 00:00 authored by Ivan C Brugere
Networks represent relationships between entities in many complex systems, ranging from online social interactions to biological cell development and brain connectivity. In these applications, we start with a question or hypothesis that the network helps answer about the system: What are the spreading dynamics of a virus through human contact? How do regions of the brain coordinate to achieve certain tasks? How will a drug or treatment affect the proteins interacting within a cell? To what extent are an individual's behavior or preferences explained by their social circle?
Networks are typically derived from underlying data, e.g., sensor readings or user activity sequences. The network definition that best answers each of these questions is unknown and must be determined by evaluating many candidate networks. The many modeling choices in this network definition are often overlooked. Even when the network definition may seem unambiguous (e.g., declared edges in online social networks, co-authorship in publication networks), choices of time scale, tie strength, interaction type, and other factors all affect it. In short, defining the best network is challenging, and it affects the subsequent analyses in every network application.
Existing approaches use specialized knowledge and rules of thumb in different domains to define the network and evaluate its utility. However, current research lacks a rigorous methodology that employs standard statistical validation. In this thesis, I (1) examine how network representations are defined on underlying non-network data, (2) examine the variety of questions and tasks on these data over several domains, and (3) propose validation strategies for measuring the inferred network's capability of answering questions on the original system of interest. I introduce a task-focused network structure inference and model selection methodology, and a principled measurement of model utility over arbitrary representations and tasks.
Using this general methodology, I apply network structure inference to user reviews and recommendation systems, geo-social networks in behavioral ecology, and data privacy in networks.
History
Advisor
Berger-Wolf, Tanya Y
Chair
Berger-Wolf, Tanya Y
Department
Computer Science
Degree Grantor
University of Illinois at Chicago
Degree Level
Doctoral
Degree name
PhD, Doctor of Philosophy
Committee Member
Getoor, Lise
Kanich, Chris
Zheleva, Elena
Ziebart, Brian