Semi-Supervised Machine Learning & Deep Learning Models in Crisis-Related Informativeness Classification
Thesis posted on 01.12.2019, 00:00 by Alessandro Rennola
This study examines the impact of several state-of-the-art Machine Learning and Deep Learning techniques in the context of semi-supervised disaster-related Twitter mining. The goal is to create a model that can classify informative tweets posted during natural and human-induced disasters, using several Machine Learning (Naive Bayes and Support-Vector Machines) and Deep Learning (Convolutional Neural Networks, Bidirectional Long Short-Term Memory) methods. First, we evaluate the performance of the supervised models. We then extend these models to assess the impact of semi-supervised techniques (self-training for NB, SVM, and CNN; Virtual Adversarial Loss for Bi-LSTM). The accuracy of our Bi-LSTM model peaks at 0.961 on the English dataset and 0.969 on the Italian dataset. To our knowledge, our semi-supervised learning models for informativeness classification outperform other supervised state-of-the-art models. Finally, we draw conclusions intended to provide a meaningful starting point for future research.
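As an illustration of the self-training idea mentioned in the abstract, the following is a minimal sketch, not the thesis implementation. It assumes a hand-written multinomial Naive Bayes with add-one smoothing, a confidence threshold of 0.8, and a handful of invented toy tweets; the function names (`train_nb`, `predict_proba`, `self_train`) and all data are hypothetical. Each round, the model labels the unlabeled pool, moves confident predictions into the training set, and retrains.

```python
import math
from collections import Counter


def train_nb(docs, labels):
    """Fit a multinomial Naive Bayes model (counts only; smoothing at predict time)."""
    classes = sorted(set(labels))
    vocab = {w for d in docs for w in d.split()}
    word_counts = {c: Counter() for c in classes}
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc.split())
    return classes, vocab, word_counts, Counter(labels)


def predict_proba(model, doc):
    """Return a dict of class -> posterior probability for one document."""
    classes, vocab, word_counts, class_counts = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for c in classes:
        total_words = sum(word_counts[c].values())
        score = math.log(class_counts[c] / total_docs)
        for w in doc.split():
            if w in vocab:  # words never seen in training are ignored
                # add-one (Laplace) smoothing
                score += math.log((word_counts[c][w] + 1) / (total_words + len(vocab)))
        log_scores[c] = score
    # Normalise in a numerically stable way
    top = max(log_scores.values())
    exp = {c: math.exp(s - top) for c, s in log_scores.items()}
    norm = sum(exp.values())
    return {c: v / norm for c, v in exp.items()}


def self_train(docs, labels, unlabeled, threshold=0.8, max_rounds=5):
    """Self-training: pseudo-label confident unlabeled documents, then retrain."""
    docs, labels, pool = list(docs), list(labels), list(unlabeled)
    model = train_nb(docs, labels)
    for _ in range(max_rounds):
        remaining, added = [], 0
        for doc in pool:
            probs = predict_proba(model, doc)
            label = max(probs, key=probs.get)
            if probs[label] >= threshold:  # threshold is an assumed value
                docs.append(doc)
                labels.append(label)
                added += 1
            else:
                remaining.append(doc)
        pool = remaining
        if added == 0:  # nothing confident enough: stop early
            break
        model = train_nb(docs, labels)
    return model, pool


# Hypothetical toy data, invented purely for illustration
labeled = ["bridge collapsed road closed", "flood water rising evacuate",
           "love this song", "great coffee today"]
gold = ["informative", "informative", "not_informative", "not_informative"]
unlabeled = ["flood road closed evacuate now", "love great coffee"]

model, leftover = self_train(labeled, gold, unlabeled)
print(predict_proba(model, "water rising on road"))
```

The fixed threshold is the main design choice here: a higher value adds fewer but cleaner pseudo-labels, while a lower value grows the training set faster at the risk of reinforcing early mistakes.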