From Correlation to Causation: Travel Behavior Modeling with Causal Discovery and Inference
thesis
posted on 2023-08-01, 00:00 authored by Rishabh S Chauhan
Modeling travel behavior is central to transportation planning. Statistical and machine learning techniques are commonly used for travel behavior modeling. Virtually all of these techniques are based on correlations, yet correlation does not necessarily imply causation. Causation-based modeling has rarely been used in the study of travel behavior, in part because of the lack of a proper methodology to determine causality. A growing body of research on causal discovery and causal inference has recently emerged that can determine causality from observed data. Causal discovery extracts a graphical causal structure from data, while causal inference estimates the quantitative causal effects between any two variables in a dataset. The goal of this thesis is to advance travel behavior modeling by adopting new causation-based methods. In the context of causality in travel behavior modeling, this thesis has four main chapters:
I. Demonstration of a methodology to conduct a travel survey for collecting data. In this chapter, I explain the detailed procedure adopted to conduct a nationwide online longitudinal survey. I describe this methodology through the CovidFuture survey – a survey to collect information about the shifts in travel-related behavior and attitudes before, during, and after the COVID-19 pandemic in the United States. The survey was deployed to the same respondents three times to collect data on how their responses to the pandemic evolved over time. The survey asked a range of questions regarding commuting, long-distance travel, working from home, online learning, online shopping, pandemic experiences, attitudes, and demographic information. The steps discussed in conducting a travel survey include survey design, survey recruitment, data cleaning, and data weighting.
II. Introduction of causal discovery and inference to modeling energy and resources (E&R) demand. I explain the notion of causality in E&R modeling and describe the relevant terminology, methodology, and assumptions associated with causal discovery and inference. I apply a causal discovery algorithm and a causal inference technique to three types of E&R: transport (travel mode choice), electricity use, and water consumption. Further, I discuss the opportunities and limitations of the current causal methods for modeling E&R demand and highlight their potential to guide future policies that lower E&R demand and build a more sustainable future.
III. Proposal of a novel methodology to study causality in travel mode choice. I propose a novel methodology that combines causal discovery with structural equation modeling (SEM) to study causality in travel mode choice behavior. SEMs have been used in the past to study mode choice; however, the methodology has some flaws in its ability to capture causality. The new modeling methodology overcomes some of the limitations of SEM by combining the strengths of both causal discovery and SEM. In this methodology, causal discovery algorithms determine causal graphs from observational data, and SEM estimates the quantitative direct causal effects. The methodology is also used to test the performance of various causal discovery algorithms. Specifically, I tested the performance of four algorithms: Peter-Clark (PC), Fast Causal Inference (FCI), Fast Greedy Equivalence Search (FGES), and Direct Linear Non-Gaussian Acyclic Models (DirectLiNGAM). The results suggest that the DirectLiNGAM-based SEM best captures the causality in mode choice behavior in data from the 2017 National Household Travel Survey (NHTS) conducted in the New York metropolitan area.
IV. Comparison of the performance of a causal and a predictive model in modeling travel mode choice. A causal discovery algorithm and a causal inference technique were used to study causality in the mode choice decision-making process in three Chicago neighborhoods, using Chicago Metropolitan Agency for Planning (CMAP) data. The performance of the causal model was compared with that of an artificial neural network (ANN) model estimated to do the same task. Both the causal and the predictive modeling approaches were found to be useful for the purposes they serve. Further, the study recognizes the unexplored potential of causal modeling in the study of mode choice behavior.
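The second stage of the methodology in chapter III (estimating direct effects once a causal graph is available) can be illustrated with a minimal sketch. It assumes a linear, acyclic model and a graph that is specified by hand; in the thesis the graph would come from a causal discovery algorithm. Each variable is regressed on its parents by ordinary least squares, which stands in for a full SEM estimation. The variable names, the coefficients, and the helper `estimate_direct_effects` are hypothetical, the data are synthetic, and this is not the thesis's implementation.

```python
import numpy as np


def estimate_direct_effects(data, parents):
    """Regress each variable on its parents in the graph (ordinary least squares).

    data:    dict mapping variable name -> 1-D array of observations
    parents: dict mapping variable name -> list of parent names in the graph
    Returns {child: {parent: estimated direct effect}}.
    """
    effects = {}
    for child, pa in parents.items():
        if not pa:
            continue  # exogenous variable: no incoming edges to estimate
        # Design matrix: intercept column plus one column per parent.
        X = np.column_stack([np.ones(len(data[child]))] + [data[p] for p in pa])
        coef, *_ = np.linalg.lstsq(X, data[child], rcond=None)
        effects[child] = dict(zip(pa, coef[1:]))  # drop the intercept
    return effects


# Synthetic data from an assumed graph:
#   income -> cars, income -> trips, cars -> trips
rng = np.random.default_rng(0)
n = 20_000
income = rng.normal(size=n)
cars = 0.8 * income + rng.normal(size=n)
trips = 0.5 * cars + 0.6 * income + rng.normal(size=n)

data = {"income": income, "cars": cars, "trips": trips}
graph = {"income": [], "cars": ["income"], "trips": ["cars", "income"]}

effects = estimate_direct_effects(data, graph)

# Correlation-based slope of trips on cars, ignoring the confounder income.
naive_slope = np.polyfit(cars, trips, 1)[0]

print("direct effect cars -> trips:", effects["trips"]["cars"])
print("naive slope of trips on cars:", naive_slope)
```

In this synthetic setting the graph-based estimate should land close to the simulated direct effect of 0.5, while the naive slope is noticeably larger because income drives both variables. This is the gap between correlation and causation that the abstract refers to.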
Advisor
Derrible, Sybil
Chair
Derrible, Sybil
Department
Civil, Materials, and Environmental Engineering
Degree Grantor
University of Illinois at Chicago
Degree Level
Doctoral
Degree Name
PhD, Doctor of Philosophy
Committee Member
Mohammadian, Abolfazl
Zou, Bo
Zheleva, Elena
Pendyala, Ram