Fine-Grained Semantic Novelty Detection

Ma, Nianzu

doi:10.25417/uic.22226599.v1

thesis

posted on 2022-12-01, 00:00 authored by Nianzu Ma

In this thesis, we introduce a new novelty detection problem: fine-grained semantic novelty detection. Specifically, given a natural-language text description d, we detect whether d represents a semantically novel phenomenon. An important application of the proposed task is developing engaging software, which contains non-utilitarian features that promote user engagement. The task is difficult for three reasons: (1) for both types of d that we consider (scene descriptions and factual texts), it requires fine-grained semantic reasoning, which existing one-class classification models cannot do, so they perform poorly on our task; (2) when d is a factual text involving named entities, it requires fine-grained reasoning over the relationship between the pair of entities in the textual context and the background knowledge of the entities, and no existing method is capable of joint reasoning in this scenario; (3) it is a one-class classification problem, which is generally more difficult than traditional classification with training instances from all classes.

To address the first challenge, we first study a fine-grained reasoning task, Comparative Preference Classification, which is a supervised learning problem. We propose a new method and show that a GAT-based (graph attention network) model is very effective for fine-grained semantic reasoning on a parsed graph representation of a text. We then study fine-grained semantic novelty detection for the two types of d and propose two tasks: (1) Semantic Novelty Detection in Scene Descriptions (SND-SD) and (2) Semantic Novelty Detection in Factual Texts (SND-FT). For the SND-SD task, we propose a new model called GAT-MA (Graph Attention Network with Max-Margin Loss and Knowledge-based Contrastive Data Augmentation) and a novel data augmentation technique that dynamically creates pseudo-novel training data, which transforms the one-class classification problem into a supervised learning problem. GAT-MA is shown to be effective on our newly created Novel Scene Description Detection dataset. For the SND-FT task, we propose a new model called PAT-SND (Property Attention Network for Semantic Novelty Detection), which leverages both the context information in the text and the background knowledge of the named entities in the knowledge base for semantic reasoning. The contrastive data augmentation technique is again employed to transform training into a supervised learning process. PAT-SND is shown to be effective on our newly created Novel Factual Text Detection dataset.

Finally, we study novelty characterization for both the SND-SD and SND-FT tasks. In the SND-SD task, novelty is characterized by semantic novelty scores assigned to every verb in a scene description; verbs with low scores are identified as the reason for novelty. In the SND-FT task, the attention weights of the PAT-SND model indicate the most important properties for semantic reasoning, and these properties, together with their corresponding values, provide the reason for novelty. For both novelty characterizations, we conduct quantitative analysis and show that the characterization capability of our proposed models outperforms the baselines by large margins.
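As a rough illustration of the ideas named in the abstract, the sketch below shows how pseudo-novel examples and a max-margin loss can turn one-class training into a supervised objective, and how per-verb scores can point to the reason for novelty. It is not the thesis's implementation. The function names, the random verb swap (a stand-in for the knowledge-based augmentation), the margin, the threshold, and all scores are assumptions made for the example.

```python
import random

def make_pseudo_novel(triple, verb_pool, rng):
    """Create a pseudo-novel example by swapping the verb of a
    (subject, verb, object) triple. Assumption: a random swap stands in
    for the knowledge-based augmentation described in the abstract."""
    subj, verb, obj = triple
    candidates = [v for v in verb_pool if v != verb]
    return (subj, rng.choice(candidates), obj)

def max_margin_loss(normal_score, novel_score, margin=1.0):
    """Hinge-style loss: zero once the normal example outscores the
    pseudo-novel example by at least `margin`."""
    return max(0.0, margin - normal_score + novel_score)

def low_scoring_verbs(verb_scores, threshold):
    """Return verbs whose score falls below a (hypothetical) threshold;
    low-scoring verbs are treated as the reason for novelty."""
    return [verb for verb, score in verb_scores.items() if score < threshold]

rng = random.Random(0)
normal = ("person", "rides", "horse")
pseudo_novel = make_pseudo_novel(normal, ["rides", "eats", "wears"], rng)

# Scores here are made-up numbers standing in for model outputs.
loss_separated = max_margin_loss(normal_score=2.0, novel_score=0.5)
loss_violating = max_margin_loss(normal_score=0.2, novel_score=0.4)
flagged = low_scoring_verbs({"holds": 0.9, "rides": 0.8, "eats": 0.1}, threshold=0.5)
```

In this sketch a higher score means "more normal", which matches the abstract's statement that low-scoring verbs explain the novelty. A real model would produce the scores from a learned network, not from fixed numbers.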

Advisor

Liu, Bing

Chair

Liu, Bing

Department

Computer Science

Degree Grantor

University of Illinois at Chicago

Degree Level

Doctoral

Degree name

PhD, Doctor of Philosophy

Committee Member

Zhang, Xinhua; Tang, Wei; Parde, Natalie; Wang, Shuai

Submitted date

December 2022

Thesis type

application/pdf

Language

en

Keywords

Novelty Detection; Anomaly Detection; Semantic Analysis; Semantic Reasoning

Licence

In Copyright
