The essence of human intelligence lies in its ability to learn continuously, accumulating past knowledge to aid future learning and problem solving. In contrast, the current machine learning paradigm often operates in isolation, lacking the capacity for continual learning and adaptation. This deficiency becomes apparent in the face of rapidly evolving artificial intelligence (AI) technologies, particularly large language models (LLMs), where incremental training remains a challenge. Continual learning (CL), also known as lifelong learning, is indispensable for truly intelligent systems, especially in dynamic environments where constant adaptation is necessary. This dissertation explores recent advancements in continual learning algorithms within the framework of language models. We first introduce the settings, challenges, and general approaches of CL. We then present our efforts to achieve both catastrophic forgetting (CF) mitigation and knowledge transfer (KT), and describe how we apply CL to different stages of language model development, including pre-training and end-task adaptation. With the aid of continual learning, language model performance improves substantially across these stages. Finally, we discuss opportunities for AI autonomy and open-world continual learning.
Advisor
Bing Liu
Department
Computer Science
Degree Grantor
University of Illinois Chicago
Degree Level
Doctoral
Degree Name
Doctor of Philosophy
Committee Member
Xinhua Zhang, Elena Zheleva, Brian Ziebart, Hu Xu