posted on 2025-08-01, 00:00authored byWenshao Zhong
The performance of a database system heavily depends on its storage engine. The design
of on-disk index data structure is critical to the storage engine performance. Many database
systems have chosen B+-trees as the on-disk index data structure. A B+-tree organizes every
internal node in a block of the I/O unit so that the reads would incur minimum I/Os. However,
writes in B+-trees are slow because a single write often causes multiple blocks to be rewritten
due to the need to maintain its structure. To enable the support for more efficient writes,
many database systems start to adopt LSM-trees in their storage engines. While an LSM-tree
achieves efficient writes by batching small updates to eliminate in-place rewrites, reads in LSM-
trees often require multiple I/Os to read the data scattered across a number of files on the disk
which makes its reads much slower than B+-trees’.
For a long time, reads and writes have been perceived to be a trade-off in designing on-
disk index data structures. It still remains a challenge to achieve efficient reads and writes at
the same time. In this thesis, we explore the idea of using compact indexes on on-disk data
structures that can achieve efficient reads and writes. In this thesis, we introduce three works
that progressively achieve more efficient index data structure designs. The first work REMIX
compactly records and replay the scanning paths to achieve more efficient scanning in LSM-
trees. The second work Disco, built on top of REMIX, achieves more efficient use of I/O for
point and range queries. Disco guarantees that a point query issues at most one I/O while
a short range query needs at most two I/Os, achieving an I/O efficiency close to a B+-tree. The third work extends the design of Disco to achieve an optimal use of I/O for on-disk data
structures.
History
Language
en
Advisor
Jakob Eriksson
Department
Computer Science
Degree Grantor
University of Illinois Chicago
Degree Level
Doctoral
Degree name
PhD, Doctor of Philosophy
Committee Member
Xingbo Wu
Chris Kanich
Abolfazl Asudeh
Stavros Sintos