University of Illinois Chicago

Continual Learning Using Out-of-Distribution Detection

Thesis
posted on 2023-12-01, authored by Gyuhak Kim
Continual learning (CL) is a machine learning paradigm in which an AI system learns a sequence of tasks while retaining the knowledge acquired along the way, so that it remains capable of performing all the learned tasks. Although humans can readily acquire knowledge for a new task while remaining able to perform previously learned tasks, this remains extremely difficult for neural networks. A central challenge of CL is catastrophic forgetting (CF), the phenomenon in which a system largely loses its previous knowledge after learning a new task. Several methods have been proposed for the two main CL scenarios. The first is task-incremental learning (TIL), which assumes that the task identity of a test instance is given at inference. The second is class-incremental learning (CIL), which assumes no task information at test time. This thesis focuses on CIL, which is especially challenging because decision boundaries between the classes of different tasks must be found without access to previous task data during training or to task identities during testing. We first conduct a theoretical study of how to solve CIL and show that the CIL problem can be decomposed into two sub-problems: within-task prediction (WP), which is in fact equivalent to TIL, and task-id prediction (TP). We further show that TP can be solved by out-of-distribution (OOD) detection and, conversely, that OOD detection can be solved by TP. These findings establish necessary and sufficient conditions for CIL: a good CIL method yields good WP and TP performance, and good WP and TP together yield good CIL. Finally, based on this decomposition, we show that CIL is in fact learnable. Several highly effective CIL techniques are then designed to demonstrate the value of the theoretical analysis.
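The WP/TP decomposition above has a simple probabilistic reading: the probability a CIL model assigns to class j of task k factors as the task-id probability P(k | x) (the TP part) times the within-task probability P(j | x, task k) (the WP part). The sketch below illustrates this factorization with made-up placeholder probabilities; it is not code or numbers from the thesis.

```python
# Hypothetical P(task k | x) from a TP / OOD-detection module (placeholders).
tp = [0.7, 0.2, 0.1]  # 3 tasks

# Hypothetical P(class j | x, task k) from a TIL (WP) head,
# one row per task, 2 classes per task (placeholders).
wp = [
    [0.9, 0.1],
    [0.4, 0.6],
    [0.5, 0.5],
]

# CIL distribution over all 6 classes: P(class (k, j) | x) = P(k | x) * P(j | x, k)
cil = [p_task * p_class for p_task, row in zip(tp, wp) for p_class in row]
assert abs(sum(cil) - 1.0) < 1e-9  # the product is still a valid distribution

# Predicted global class, then recover (task id, within-task class).
pred = max(range(len(cil)), key=cil.__getitem__)
task_id, within_class = divmod(pred, 2)
print(task_id, within_class)  # prints: 0 0
```

Good TP concentrates `tp` on the right task and good WP concentrates the right row of `wp` on the right class, which is why good performance on both sub-problems is exactly what good CIL requires.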
The first two methods, HAT+CSI and Sup+CSI, combine the existing OOD detection technique CSI with the TIL methods HAT and SupSup, respectively. Both are based on ResNet-18 and outperform strong baselines in both TIL and CIL by large margins. Two further methods are Multi-head model for continual learning via OOD REplay (MORE) and Replay, OOD, and WP for CIL (ROW). The two are similar in that both use replay samples stored in a memory buffer to train OOD detection models, but ROW is more principled and stronger than MORE because it is built directly on the decomposition. Both outperform strong baselines, remain effective with very few replay samples, and are naturally able to detect OOD instances while classifying the continually learned classes. We evaluate the proposed methods through extensive experiments on the standard public benchmark datasets MNIST, CIFAR, and Tiny-ImageNet.
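The OOD-awareness mentioned above can be pictured as a reject option on top of task-id prediction: if no task's model assigns the input a sufficiently high score, the input is flagged as OOD instead of being forced into a known class. The sketch below is a hypothetical illustration of this idea only; the function name, scores, and threshold are placeholders, not the thesis's actual scoring rule.

```python
def predict_or_reject(task_scores, threshold):
    """task_scores[k] is a hypothetical OOD/confidence score of task k's
    model on an input x. Return the predicted task id, or None when every
    task considers x out-of-distribution (score below threshold)."""
    best_task = max(range(len(task_scores)), key=task_scores.__getitem__)
    if task_scores[best_task] < threshold:
        return None  # no task's model claims the input: treat it as OOD
    return best_task

print(predict_or_reject([0.2, 0.8, 0.3], threshold=0.5))  # prints: 1
print(predict_or_reject([0.1, 0.2, 0.1], threshold=0.5))  # prints: None
```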

History

Advisor

Bing Liu

Department

Computer Science

Degree Grantor

University of Illinois Chicago

Degree Level

  • Doctoral

Degree name

PhD, Doctor of Philosophy

Committee Member

Xinhua Zhang, Brian Ziebart, Sathya Ravi, Sahisnu Mazumder

File format

application/pdf
