New Models and Algorithms for Data Analysis
thesisposted on 27.07.2018 by Benjamin Fish
In order to distinguish essays and pre-prints from academic theses, we have a separate category. These are often much longer text based documents than a paper.
In this thesis, we introduce and analyze new models and new algorithms for problems in data analysis. Many new challenges and constraints for data analysis have arisen as data analysis has become increasingly important. We tackle a few of these challenges here. First, we explore the challenges that arise when data analysis done adaptively, so that previous inferences guide future analyst’s queries. We give faster algorithms for providing accurate responses to data analyst’s adaptive queries. Secondly, we initiate the study of the theory for learning from label proportions, which captures voting and similar non-standard settings. We compare this type of learning to classical supervised machine learning, and show examples for which learning from label proportions is possible. Finally, we tackle learning social networks from voting data, where individuals communicate in order to decide their votes. We show that modeling assumptions on how votes are influenced by the network can be brittle: a small change to the model may drastically change hardness of learning the network, as well as change the resulting learned networks.