University of Illinois Chicago

TinyProf: Towards Continuous Performance Introspection through Scalable Parallel I/O

conference contribution
posted on 2024-07-08, 20:27 authored by Ke Fan, Suraj Kesavan, Steve Petruzza, Sidharth Kumar
Performance profiling tools are crucial for HPC specialists to identify performance bottlenecks in parallel codes at various levels of granularity (i.e., across nodes, ranks, and threads). Although numerous sophisticated profiling tools have been developed, achieving scalable performance introspection at large scale remains a challenge. The difficulty is particularly evident in efficiently writing profiles to disk at runtime and subsequently reading them with constrained computing resources for post-hoc analysis. In this paper, we present TinyProf, a performance introspection framework that tackles I/O-related challenges in profiling performance data at scale. TinyProf's scalability is attributed to an optimal runtime that consists of three key components: (1) an efficient in-memory data structure that minimizes memory consumption and decreases communication overhead during parallel file I/O; (2) a customizable three-phase I/O scheme that generates optimal I/O patterns capable of scaling with high core counts; and (3) a streamlined data format for profiles, which guarantees minimal sizes for profile files. Together, these three techniques make the profiler scalable, keeping its overhead low (less than 5%) even at high process counts. This low overhead makes it possible to run the profiler by default whenever the application is running, enabling continuous introspection of performance. We demonstrate the efficiency of our framework using large-scale parallel applications and perform a thorough evaluation against existing systems at up to 32k processes.

Funding

Collaborative Research: PPoSS: Large: A Full-stack Approach to Declarative Analytics at Scale | Funder: University of Alabama at Birmingham | Grant ID: 2316157

Collaborative Research: SHF: Small: Scalable and Extensible I/O Runtime and Tools for Next Generation Adaptive Data Layouts | Funder: National Science Foundation | Grant ID: CCF-2401274

Citation

Fan, K., Kesavan, S., Petruzza, S., & Kumar, S. (2024, May). TinyProf: Towards Continuous Performance Introspection through Scalable Parallel I/O. ISC High Performance 2024 Research Paper Proceedings (39th International Conference) (pp. 1-12). Institute of Electrical and Electronics Engineers (IEEE). https://doi.org/10.23919/isc.2024.10528932

Publisher

Institute of Electrical and Electronics Engineers (IEEE)