Detecting Deception in Online Social Networks
Thesis posted on 2015-02-27, 00:00, authored by Jalal S. Alowibdi
Online Social Networks (OSNs) play a significant role in the daily lives of hundreds of millions of people. However, many user profiles in OSNs contain deceptive information. Existing studies have shown that lying in OSNs is quite widespread, often to protect a user's privacy. In this dissertation, we propose a novel approach for detecting deceptive profiles in OSNs. Our ultimate goal is to find deceptive information about user gender and location. Specifically, we define a set of analysis methods for detecting deceptive information about user gender and location on Twitter. First, we collected a large dataset of Twitter profiles and tweets. Next, we defined methods for guessing gender from Twitter profile colors and names. Our methods are quite scalable because we avoid analyzing the text of messages, which typically involves high computational complexity. We applied a number of preprocessing methods to the raw Twitter data, which significantly enhanced the accuracy of our predictions. Subsequently, we applied Bayesian classification and K-means clustering algorithms to Twitter profile characteristics (e.g., profile layout colors, first names, user names) and geolocations to analyze user behavior. We established the overall accuracy of each gender indicator through extensive experimentation with our crawled dataset.
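To make the idea of Bayesian classification over profile characteristics more concrete, the following is a minimal sketch in Python. It is not the dissertation's implementation: the feature extractor (`features`), the `NaiveBayes` class, the choice of name suffixes and a dominant color channel as features, and the eight training profiles are all hypothetical and chosen only for illustration. It shows how a classifier could guess gender from a first name and a profile color without analyzing any tweet text.

```python
import math
from collections import Counter, defaultdict


def features(first_name, color_hex):
    """Hypothetical features: name suffixes plus the dominant RGB channel."""
    name = first_name.strip().lower()
    r, g, b = (int(color_hex[i:i + 2], 16) for i in (0, 2, 4))
    dominant = max((r, "red"), (g, "green"), (b, "blue"))[1]
    return [f"last1={name[-1:]}", f"last2={name[-2:]}", f"color={dominant}"]


class NaiveBayes:
    """Multinomial naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.label_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, samples):
        # samples: iterable of (feature_list, label) pairs
        for feats, label in samples:
            self.label_counts[label] += 1
            for f in feats:
                self.feature_counts[label][f] += 1
                self.vocab.add(f)
        return self

    def log_scores(self, feats):
        # log P(label) + sum of log P(feature | label), smoothed
        total = sum(self.label_counts.values())
        scores = {}
        for label, n in self.label_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n / total)
            for f in feats:
                score += math.log((counts[f] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, feats):
        scores = self.log_scores(feats)
        return max(scores, key=scores.get)


# Toy, made-up training profiles: (first name, profile color, label).
TRAIN = [
    ("Maria", "FF66AA", "female"),
    ("Anna", "CC3399", "female"),
    ("Julia", "EE2288", "female"),
    ("Sophia", "DD4477", "female"),
    ("John", "2244CC", "male"),
    ("Robert", "113399", "male"),
    ("Kevin", "3366DD", "male"),
    ("Mark", "0055AA", "male"),
]

model = NaiveBayes().fit((features(n, c), y) for n, c, y in TRAIN)

print(model.predict(features("Olivia", "EE3377")))  # female
print(model.predict(features("Brian", "2255BB")))   # male
```

Because every feature here comes from short profile fields, training and prediction cost grows only with the number of profiles, which matches the scalability argument in the abstract. A real system would need a far larger labeled dataset and a measured accuracy for each indicator.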