KHAN-THESIS-2019.pdf (3.3 MB)
Debiasing 2016 Twitter Election Analysis via Multi-Level Regression and Poststratification (MRP)
thesis
posted on 2019-08-06, 00:00 authored by Shoaib Khan
Sentiment analysis of social media is widely used to gauge the views of the general population. However, according to Pew Research, social media users are not a representative sample of the US population, and this mismatch can bias the results of any analysis. Very few studies have attempted to account for demographic biases among Twitter users. By combining approaches from computer science and statistics, we propose a simple but powerful two-step approach to address this gap. The first step predicts the demographic attributes and sentiment of social media users based on their follower networks and tweets. The second step employs multilevel regression and poststratification (MRP), a well-known statistical approach for debiasing data, to predict the actual proportion of the population holding a particular view. Using the prediction of poll results for key states during the 2016 US presidential election as a case study, we show that social media can yield predictions similar to those of polls.
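As a rough illustration of the poststratification half of MRP, the sketch below reweights per-cell support estimates by population counts. It is not the thesis's implementation: the demographic cells, the support values (which in MRP would come from a fitted multilevel regression), the census counts, and the sample sizes are all hypothetical numbers chosen to show how an unrepresentative sample shifts a raw average.

```python
from typing import Dict, Tuple

# (age group, gender): hypothetical demographic cells, not the thesis's actual cells.
Cell = Tuple[str, str]


def weighted_support(cell_support: Dict[Cell, float], cell_counts: Dict[Cell, int]) -> float:
    """Average the per-cell support, weighting each cell by its count."""
    total = sum(cell_counts.values())
    if total == 0:
        raise ValueError("cell counts sum to zero")
    return sum(cell_support[cell] * n for cell, n in cell_counts.items()) / total


# Hypothetical per-cell support, standing in for multilevel-regression output.
cell_support = {
    ("18-29", "F"): 0.60,
    ("18-29", "M"): 0.50,
    ("30+", "F"): 0.40,
    ("30+", "M"): 0.30,
}

# Hypothetical number of sampled social media users per cell (skews young).
sample_counts = {
    ("18-29", "F"): 400,
    ("18-29", "M"): 400,
    ("30+", "F"): 100,
    ("30+", "M"): 100,
}

# Hypothetical census counts for the same cells in one state (skews older).
population_counts = {
    ("18-29", "F"): 100,
    ("18-29", "M"): 100,
    ("30+", "F"): 400,
    ("30+", "M"): 400,
}

# Raw estimate: cells weighted by how often they appear in the sample.
raw_estimate = weighted_support(cell_support, sample_counts)

# Poststratified estimate: the same cells weighted by their population share.
poststratified_estimate = weighted_support(cell_support, population_counts)

print(f"raw: {raw_estimate:.2f}, poststratified: {poststratified_estimate:.2f}")
```

With these made-up inputs the raw average overstates support because younger cells dominate the sample; weighting by population counts pulls the estimate toward the older cells.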
Advisor
Zheleva, Elena
Chair
Zheleva, Elena
Department
Computer Science
Degree Grantor
University of Illinois at Chicago
Degree Level
- Masters
Committee Member
Berger-Wolf, Tanya
Parde, Natalie
Submitted date
May 2019
Issue date
2019-04-22