University of Illinois Chicago

File(s) under embargo: 1 year(s), 8 month(s), 19 day(s) until file(s) become available

Zero-shot Stance Detection

thesis
posted on 2024-08-01, 00:00 authored by Chenye Zhao
People often express their stances toward specific targets on social media, e.g., epidemic prevention, gasoline prices, or equal rights. Aggregating these stances can yield insights into important events such as presidential elections. The goal of stance detection is to predict whether the author of a piece of text is in favor of, against, or neutral toward a specific target. Zero-shot stance detection (ZSSD) aims to identify the stance of a text toward a target that is unseen during training and appears only at the inference stage. Despite growing interest in ZSSD, existing work has several limitations. First, current ZSSD works fail to utilize other informative and important targets that are missed by human annotators, so the training data is not fully utilized. Second, current ZSSD studies focus on limited target types and fail to ensure a domain-level zero-shot setting; moreover, most recent advances in stance detection are limited to English and pay little attention to other languages such as Chinese. Third, most existing work on multilingual stance detection focuses on in-target or cross-target tasks within a small set of targets, and the task of multilingual ZSSD, which includes both noun-phrase and claim targets, remains unexplored. Fourth, most previous studies focus only on targets from a single text domain or a few domains (e.g., the financial domain), so zero-shot models cannot generalize well to unseen targets from diverse domains. Fifth, while pretrained language models (PLMs) have greatly enhanced stance detection, they remain vulnerable to adversarial attacks, i.e., manipulations that preserve textual semantics but lead to incorrect predictions. Such vulnerabilities have not previously been explored in the context of stance detection. This thesis addresses each of these limitations.
First, we propose a target-based teacher-student framework that performs target-based data augmentation to increase the diversity of targets during training. Models trained with more diverse targets can better generalize to unseen targets. Second, we explore a more practical yet challenging domain-based ZSSD task in which we train a model on a large number of targets from known domains and evaluate it on unseen targets from a completely new domain. Notably, we find that large language models face more challenges in this domain-level zero-shot scenario. We also investigate stances toward noun-phrase targets and claim targets derived from the same text, a perspective previously unexplored in this field. To address data scarcity in languages other than English, we present C-STANCE which, to our knowledge, is the first large dataset for Chinese zero-shot stance detection. Extensive experimental results show that C-STANCE is a challenging new benchmark. Third, we investigate bilingual ZSSD, contrasting it with cross-lingual and monolingual ZSSD. We also explore a challenging bilingual scenario with low word overlap between claim targets and texts. To support this, we assemble Bi-STANCE, a bilingual dataset with over 100,000 annotated instances across various domains in Chinese and English. Fourth, we propose ZeroStance, a novel dataset generation approach for open-domain stance detection that greatly improves zero-shot performance by increasing data diversity and requires no training data (text or targets) from existing datasets. We present an open-domain dataset, CHATStance, which is much more data-efficient and cost-effective. Fifth, we introduce StanceAttack, the first adversarial attack method for stance detection. Using chain-of-thought prompting with ChatGPT, we generate human-like adversarial examples that retain the original text's semantics. Our method effectively attacks models with higher success rates and fewer queries. Human evaluations confirm the quality of these examples, and adversarial training with them significantly enhances model robustness.

History

Advisor

Cornelia Caragea

Department

Computer Science

Degree Grantor

University of Illinois Chicago

Degree Level

  • Doctoral

Degree name

Doctor of Philosophy

Committee Member

Xinhua Zhang, Elena Zheleva, FNU Shweta, Doina Caragea

Thesis type

application/pdf

Language

  • en
