Deep Learning Methods for Low-resource Emotion-related Tasks
Thesis
Posted on 2023-12-01, authored by Mahshid Hosseini
Although pre-training on large text corpora has yielded substantial gains on many NLP tasks, effectively employing such paradigms typically requires tens of thousands of labeled samples. However, the time-consuming and labor-intensive annotation process makes it challenging to obtain large labeled datasets in real-world scenarios such as computer-assisted therapy sessions, where relevant empathy or emotion data may not be available at scale. Given the subjective nature of an author’s intent and other intrinsic complexities of empathy and emotions, labeling such data requires nuanced understanding and interpretation, which is not always straightforward. Individual interpretations of emotions can vary with personal experiences, cultural backgrounds, and biases. As a result, creating large datasets that reflect a universally accepted labeling of emotional states or empathetic responses is not just expensive but also complex. Furthermore, contextual complexities, such as the implicit expression of empathy or emotion that requires deeper reasoning beyond surface-level lexical patterns, contribute to the scarcity of large, accurately labeled datasets in domains that deal with empathy or emotion.
In this thesis, we propose methods to overcome the challenges posed by data scarcity in emotion-related text classification tasks. We incorporate underlying inductive biases into our models via domain-adaptive pre-training and multi-task learning with knowledge distillation, which yields better performance when integrating data from relevant, carefully chosen domains and tasks. Furthermore, we aim to learn models that are not only accurate but also well calibrated, improving model trustworthiness in low-resource settings. Specifically, our models provide calibrated confidence so that low-confidence emotion or empathy examples can be passed on to human experts. We achieve this using novel knowledge distillation and data augmentation techniques based on MixUp and contrastive learning. In particular, we show that when labeled data is limited, mixing augmented contrastive samples according to how they contribute to model learning improves performance. Finally, we present a novel data-agnostic strategy for prompt-based fine-tuning that leverages feature moments as a data augmentation technique and employs training dynamics to select more informative samples for concatenation into demonstrations in the input context.
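To make the MixUp-based augmentation concrete, the sketch below shows the core interpolation step on a batch of sentence embeddings and their labels. This is a minimal illustration under assumed details, not the exact configuration used in the thesis: the function name mixup_embeddings, the Beta(alpha, alpha) prior, and the tensor shapes are all illustrative assumptions.

# Minimal sketch of MixUp on sentence embeddings; names and shapes are hypothetical.
import torch

def mixup_embeddings(x, y, alpha=0.4):
    """Interpolate a batch of embeddings x (B, D) and one-hot labels y (B, C)."""
    # Sample a mixing coefficient from a Beta(alpha, alpha) prior.
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))           # random pairing within the batch
    x_mixed = lam * x + (1.0 - lam) * x[perm]  # convex combination of inputs
    y_mixed = lam * y + (1.0 - lam) * y[perm]  # matching combination of labels
    return x_mixed, y_mixed

# Usage: 8 embeddings of dimension 16 over 3 emotion classes.
x = torch.randn(8, 16)
y = torch.nn.functional.one_hot(torch.randint(0, 3, (8,)), num_classes=3).float()
x_aug, y_aug = mixup_embeddings(x, y)

Mixed pairs such as (x_aug, y_aug) can then be fed to the classifier alongside the original batch; the thesis's contrastive weighting of augmented samples would sit on top of this basic interpolation step.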
History
Advisor
Cornelia Caragea
Department
Computer Science
Degree Grantor
University of Illinois Chicago
Degree Level
Doctoral
Degree Name
PhD, Doctor of Philosophy
Committee Members
Professor Natalie Parde
Professor Elena Zheleva
Professor Xinhua Zhang
Professor Doina Caragea