
Debiasing 2016 Twitter Election Analysis via Multi-Level Regression and Poststratification (MRP)

thesis
posted on 06.08.2019, 00:00 by Shoaib Khan
Sentiment analysis of social media is widely used to gauge the views of the general population. However, according to Pew Research, social media users are not a representative sample of the US population, and this mismatch can bias the results of any analysis. Very few studies have attempted to account for demographic biases among Twitter users. By combining approaches from computer science and statistics, we propose a simple but powerful two-step approach to address this gap. The first step predicts the demographic attributes and sentiment of social media users from their follower networks and tweets. The second step employs multilevel regression and poststratification (MRP), a well-known statistical approach for debiasing data, to predict the actual proportion of the population holding a particular view. Using the prediction of poll results in key states during the 2016 US presidential election as a case study, we show that social media can yield predictions similar to those of polls.
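To illustrate the poststratification half of the second step, here is a minimal Python sketch. It is not the thesis's implementation. The demographic cells, support rates, and counts are invented for illustration, and the per-cell support values stand in for the output of a multilevel regression, which is not fitted here. The sketch only shows how reweighting cell-level estimates by population counts, rather than by sample counts, changes the overall estimate.

```python
from typing import Dict, Tuple

# A demographic cell, here (age group, gender). Hypothetical choice of cells.
Cell = Tuple[str, str]


def weighted_support(cell_support: Dict[Cell, float],
                     cell_counts: Dict[Cell, int]) -> float:
    """Average per-cell support, weighting each cell by its count."""
    total = sum(cell_counts.values())
    return sum(cell_support[c] * n for c, n in cell_counts.items()) / total


# Illustrative per-cell support for a candidate. In MRP these would come
# from a multilevel regression; here they are made-up numbers.
cell_support = {
    ("18-29", "F"): 0.70,
    ("18-29", "M"): 0.60,
    ("30+", "F"): 0.50,
    ("30+", "M"): 0.40,
}

# Made-up sample composition, skewed toward younger users.
sample_counts = {
    ("18-29", "F"): 400,
    ("18-29", "M"): 400,
    ("30+", "F"): 100,
    ("30+", "M"): 100,
}

# Made-up population composition (stand-in for census counts).
census_counts = {
    ("18-29", "F"): 100,
    ("18-29", "M"): 100,
    ("30+", "F"): 400,
    ("30+", "M"): 400,
}

# Weighting by the sample reproduces the biased raw estimate;
# weighting by the population gives the poststratified estimate.
raw = weighted_support(cell_support, sample_counts)
poststratified = weighted_support(cell_support, census_counts)

print(f"raw sample estimate:      {raw:.2f}")
print(f"poststratified estimate:  {poststratified:.2f}")
```

With these invented numbers, the raw estimate overstates support because the sample over-represents the younger cells. Weighting by population counts pulls the estimate toward the cells that dominate the actual population.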

History

Advisor

Zheleva, Elena

Chair

Zheleva, Elena

Department

Computer Science

Degree Grantor

University of Illinois at Chicago

Degree Level

Masters

Committee Member

Berger-Wolf, Tanya
Parde, Natalie

Submitted date

May 2019

Issue date

22/04/2019
