Resilient Structures and Robust Machine Learning Algorithms
thesisposted on 01.08.2020, 00:00 by Shelby Lynn Heinecke
Networks and learning algorithms are key themes in artificial intelligence (AI). Networks can be used to model the connections of a vast community of internet-of-things (IoT) devices, the internet, and distributed databases, among other important computing contexts for AI. Learning algorithms, which generate trained models, can be strategically designed to learn from a variety of data-rich networks, such as sensor networks, device networks, or even networks of humans working on crowdsourcing tasks. In real-world settings, noise generated from sources such as device inaccuracy or human error are unavoidable. Likewise, attacks in real-world networks, such as the injection of malware, are inevitable due to their ever-evolving sophistication. In this thesis, we study the intrusions of noise and malicious attacks on networks and learning from networks. We devise structural and algorithmic solutions for mitigating the effects of these unwanted intrusions. We first we consider viral propagation under the independent cascade model of infection spread on half-regular bipartite networks and characterize the most resilient structures. Then, we study learning and generalizing from a network of crowd workers, where crowd workers provide erroneous labels to unlabelled data at fixed, unknown error rates. In this setting, we develop a three-step probably approximately correct (PAC) algorithm that incorporates majority voting, pure-exploration bandits, and noisy-PAC learning and demonstrate our algorithm’s improvement over baseline approaches. Finally, we study learning from a network of participants, each with their own distribution on the unlabelled data in the presence of noise. We develop collaborative PAC algorithms robust to classification noise and prove sample complexity bounds. We also study the communication complexity of collaborative PAC learning, with and without classification noise, and develop communication efficient algorithms in both settings.