University of Illinois at Chicago
KHAN-THESIS-2019.pdf (3.3 MB)

Debiasing 2016 Twitter Election Analysis via Multi-Level Regression and Poststratification (MRP)

posted on 2019-08-06, 00:00 authored by Shoaib Khan
Sentiment analysis of social media is widely used to gauge the views of the general population. However, according to Pew Research, social media users are not a representative sample of the US population. Such sampling bias can skew the results of any analysis, yet very few studies have attempted to account for demographic biases among Twitter users. By combining approaches from computer science and statistics, we propose a simple but powerful two-step approach to address this gap. The first step predicts the demographic attributes and sentiment of social media users based on their follower networks and tweets. The second step employs multilevel regression and poststratification (MRP), a well-known statistical approach for debiasing data, to predict the actual proportion of the population holding a particular view. Using the prediction of poll results for key states during the 2016 US presidential election as a case study, we show that social media can yield predictions similar to those of polls.
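The debiasing idea in the second step can be illustrated with a minimal sketch. It is not the thesis implementation. The demographic cells, sample counts, and population counts are made-up placeholders, and the multilevel regression is replaced here by a much simpler stand-in: shrinking each cell's observed rate toward the overall rate. Only the poststratification step, a population-weighted average of the cell estimates, follows the standard definition.

```python
from typing import Dict, Tuple

# Hypothetical demographic cell: (age group, gender).
Cell = Tuple[str, str]


def shrink_cell_estimates(
    sample: Dict[Cell, Tuple[int, int]], prior_strength: float = 5.0
) -> Dict[Cell, float]:
    """Estimate the share of positive sentiment per cell.

    `sample` maps each cell to (positive users, total users). Each cell's
    rate is pulled toward the overall sample rate, and more strongly so for
    small cells. This is a crude stand-in for a multilevel regression,
    not the model used in the thesis.
    """
    total_pos = sum(pos for pos, _ in sample.values())
    total_n = sum(n for _, n in sample.values())
    overall = total_pos / total_n
    return {
        cell: (pos + prior_strength * overall) / (n + prior_strength)
        for cell, (pos, n) in sample.items()
    }


def poststratify(
    cell_estimates: Dict[Cell, float], population_counts: Dict[Cell, int]
) -> float:
    """Weight each cell estimate by that cell's size in the target population."""
    total = sum(population_counts[cell] for cell in cell_estimates)
    weighted = sum(
        cell_estimates[cell] * population_counts[cell] for cell in cell_estimates
    )
    return weighted / total


# Made-up data: younger users are over-represented in the sample.
sample = {
    ("18-29", "F"): (60, 100),
    ("18-29", "M"): (50, 100),
    ("65+", "F"): (4, 10),
    ("65+", "M"): (3, 10),
}
# Made-up population counts for the same cells (for example, from a census).
population = {
    ("18-29", "F"): 100,
    ("18-29", "M"): 100,
    ("65+", "F"): 150,
    ("65+", "M"): 150,
}

raw_share = sum(p for p, _ in sample.values()) / sum(n for _, n in sample.values())
adjusted_share = poststratify(shrink_cell_estimates(sample), population)
print(f"raw sample share: {raw_share:.3f}")
print(f"poststratified share: {adjusted_share:.3f}")
```

Because the older cells make up a larger part of the placeholder population than of the sample, the poststratified share moves toward their lower rates. A real analysis would fit a multilevel model over the predicted demographic attributes instead of the shrinkage shortcut used here.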



Zheleva, Elena




Computer Science

Degree Grantor

University of Illinois at Chicago

Degree Level

  • Masters

Committee Member

  • Berger-Wolf, Tanya
  • Parde, Natalie

Submitted date

May 2019
