posted on 2024-08-01, 00:00authored byAnshuman Misra
This thesis is a comprehensive study of causal ordering and causality detection algorithms in the Byzantine failure model across a variety of system settings. This includes proving impossibility results for problems that may appear to be solvable and presenting solutions in the form of algorithms with correctness proofs when problems are solvable. The first part of this thesis deals with Byzantine fault-tolerant causality detection. Causality detection is a fundamental problem in distributed computing and deals with determining an ordering across events in a distributed execution. Determining causality is a critical tool in reasoning about and designing distributed systems. Further, there are many applications of causality detection including determining consistent recovery points in distributed databases, deadlock detection, termination detection, distributed predicate detection, distributed debugging and monitoring, and the detection of race conditions and other synchronization errors. In this thesis, we prove a fundamental impossibility result for the causality detection problem - It is impossible to detect causality across events under Byzantine failures in an asynchronous system. In light of this result, we developed a causality detection protocol for synchronous systems. This protocol establishes another fundamental result - the causality detection problem is solvable in synchronous systems. The second part of this thesis deals with Byzantine fault-tolerant causal ordering. The focus is ensuring that message delivery at correct processes does not violate application level semantics characterized by causal relationships. Applications of causal ordering include distributed data stores, fair resource allocation, and collaborative applications such as multiplayer online gaming, social networks, event notification systems, group editing of documents, and distributed virtual environments. We prove a variety of solvability/impossibility results and provide seven causal ordering algorithms along with correctness proofs across different system settings.