People often express their stances toward various targets (e.g., marijuana legalization, wearing face masks during the COVID-19 pandemic, or political figures) on social media. In aggregate, these stances can facilitate important tasks ranging from social media monitoring to predicting presidential elections. The general goal of stance detection is to identify whether the opinion holder is in favor of, against, or neutral toward a specific target. Even though stance detection has attracted considerable attention, its performance is still far from satisfactory in real-world applications. The scarcity of annotated data is currently a major challenge for stance detection.
In this dissertation, we address this data scarcity issue in four different ways. First, we propose two multi-task frameworks for stance detection, which aim to boost the performance of stance detection with the help of auxiliary tasks. Second, since data augmentation is an effective strategy for handling scarce data, we propose two data augmentation methods that augment stance datasets by generating target-relevant and label-compatible sentences. Third, we present P-Stance, a large stance detection dataset that makes it possible to perform large-scale evaluations. The P-Stance dataset can serve as a new benchmark for stance detection and enable multiple stance detection tasks. Finally, we evaluate two training strategies for stance detection and propose to further improve task performance with knowledge distillation. We propose novel knowledge distillation methods that can be applied not only to stance detection but also to other NLP tasks.