University of Illinois Chicago

Predictive Modeling of Application Runtime in Dragonfly Systems

thesis
posted on 2025-05-01, 00:00 authored by Pietro Lodi Rizzini
The Dragonfly interconnect is widely adopted in extreme-scale systems, yet its shared nature often causes traffic from different applications to compete for network resources, producing workload interference and variable application runtimes. This work uses deep neural network methods to forecast application iteration times from network features collected at the router port level. The problem is addressed with graph neural network (GNN)-based dynamic models that are trained on an ad hoc graph structure that reflects the physical characteristics of the system, and can capture its temporal and structural dynamics. Results show that this methodology outperforms the baselines when forecasting one and two steps ahead. However, it faces scalability challenges when applied to larger systems. To address these limitations, the methodology was enhanced by constructing an ensemble model that integrates a custom GNN-based component with the recently proposed TimeLLM framework, which leverages large language models for time series forecasting.

Advisor

Zhiling Lan

Department

Computer Science

Degree Grantor

University of Illinois Chicago

Degree Level

  • Masters

Degree name

MS, Master of Science

Committee Member

Sourav Medya, Danilo Ardagna

Thesis type

application/pdf

Language

  • en
