University of Illinois Chicago

Continual Learning with Language Models

Thesis, posted on 2024-05-01, authored by Zixuan Ke
The essence of human intelligence lies in its ability to learn continually, accumulating past knowledge to aid future learning and problem solving. In contrast, the current machine learning paradigm often operates in isolation, lacking the capacity for continual learning and adaptation. This deficiency is especially apparent for rapidly evolving artificial intelligence (AI) technologies such as large language models (LLMs), where incremental training remains a challenge. Continual learning (CL), also known as lifelong learning, is indispensable for truly intelligent systems, especially in dynamic environments where constant adaptation is necessary. This dissertation explores recent advances in continual learning algorithms within the framework of language models. We first introduce the settings, challenges, and general approaches of CL. We then present our efforts to achieve both catastrophic forgetting (CF) mitigation and knowledge transfer (KT), and show how CL can be applied to different stages of language model development, including pre-training and end-task adaptation. With the aid of continual learning, language model performance improves substantially. Finally, we discuss opportunities for AI autonomy and open-world continual learning.
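The abstract's central technical tension, mitigating catastrophic forgetting while still allowing knowledge transfer, is often illustrated with a regularization-based update. The Python sketch below shows a generic EWC-style quadratic penalty (Kirkpatrick et al., 2017) that discourages parameters from drifting away from values learned on an earlier task. This is a hypothetical illustration, not the method proposed in this dissertation; the function name, the old_params and fisher inputs, and the lam coefficient are all illustrative assumptions.

import torch
import torch.nn as nn

def ewc_penalty(model: nn.Module,
                old_params: dict,
                fisher: dict,
                lam: float = 0.4) -> torch.Tensor:
    """Quadratic penalty pulling parameters toward their old-task values,
    weighted by a (diagonal) Fisher-information estimate per parameter.
    old_params and fisher map parameter names to tensors saved after
    training on the previous task (hypothetical inputs for illustration)."""
    loss = torch.zeros(())
    for name, p in model.named_parameters():
        if name in old_params:
            # Penalize movement of each weight, more strongly where the
            # Fisher estimate says the old task was sensitive to it.
            loss = loss + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return lam * loss

# Usage inside a training step on a new task (task_loss assumed given):
#   total_loss = task_loss + ewc_penalty(model, old_params, fisher)
#   total_loss.backward()

The fisher weights are typically estimated from squared gradients of the log-likelihood on the previous task's data; the sketch conveys the basic stability-plasticity trade-off between the penalty term and the new-task loss.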

History

Advisor

Bing Liu

Department

Computer Science

Degree Grantor

University of Illinois Chicago

Degree Level

  • Doctoral

Degree name

Doctor of Philosophy

Committee Member

Xinhua Zhang, Elena Zheleva, Brian Ziebart, Hu Xu

Format

application/pdf
