University of Illinois Chicago
Utilizing 2D Attention-Based Models for Memory Efficient 3D Reconstruction

thesis
posted on 2025-05-01, 00:00 authored by Raj Paresh Mehta
3D reconstruction is a fundamental problem in computer vision and graphics, with applications in virtual reality, robotics, and digital content creation. Classical representations such as voxel grids and meshes are structured, but their discrete structure and sparsity make it difficult to train deep learning models for real-time processing. To address some of these challenges, recent works use Neural Radiance Fields (NeRFs) and 3D Gaussian Splatting (3DGS), which have enabled high-quality scene reconstruction with improved rendering capabilities. Compared to NeRFs, 3D Gaussian Splatting offers real-time rendering and can represent large-scale scenes efficiently. However, one of its key limitations is its high memory and storage requirements, as accurately reconstructing a scene often requires millions of Gaussians. To overcome this, we propose a novel spatial grouping and 2D attention-based framework that learns compressed representations of 3D Gaussian Splatting scenes. Our method significantly reduces memory overhead while preserving visual fidelity and rendering quality, making it a viable solution for efficient and scalable 3D scene representation. We use the CO3D dataset by Meta AI, which contains 1.5 million frames from nearly 19,000 videos capturing objects from 50 MS-COCO categories. Because the dataset is captured without any coordinate calibration, it mimics real-world settings; despite the challenges this poses, it lets us test our framework under conditions close to everyday use. Our experimental evaluations indicate that a 3DGS scene can be compressed by as much as 16 times without compromising its visual quality.
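
To give a rough sense of the memory figures mentioned in the abstract, the following is a back-of-the-envelope sketch, not the thesis's method. It assumes the commonly used 3DGS parameterization (position, scale, rotation quaternion, opacity, and degree-3 spherical-harmonic color, 59 float32 values per Gaussian); the thesis may use a different layout. The function names and the one-million-Gaussian scene are hypothetical, and the 16x ratio is simply the upper bound quoted in the abstract.

```python
# Assumed per-Gaussian layout (standard 3DGS formulation, not confirmed by the thesis):
#   3 position + 3 scale + 4 rotation quaternion + 1 opacity + 48 SH color coefficients
FLOATS_PER_GAUSSIAN = 3 + 3 + 4 + 1 + 48
BYTES_PER_FLOAT = 4  # float32


def scene_size_bytes(num_gaussians: int) -> int:
    """Uncompressed parameter storage for a scene, under the assumed layout."""
    return num_gaussians * FLOATS_PER_GAUSSIAN * BYTES_PER_FLOAT


def compressed_size_bytes(num_gaussians: int, ratio: float = 16.0) -> float:
    """Storage after applying a given compression ratio (16x is the abstract's upper bound)."""
    return scene_size_bytes(num_gaussians) / ratio


if __name__ == "__main__":
    n = 1_000_000  # hypothetical scene size
    print(f"uncompressed: {scene_size_bytes(n) / 1e6:.2f} MB")
    print(f"at 16x:       {compressed_size_bytes(n) / 1e6:.2f} MB")
```

Under these assumptions, a scene with one million Gaussians occupies about 236 MB of raw parameters, and a 16x reduction brings that to roughly 14.75 MB.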

Advisor

Sathya Ravi

Department

Computer Science

Degree Grantor

University of Illinois Chicago

Degree Level

  • Masters

Degree name

MS, Master of Science

Committee Member

Xinhua Zhang, Fabio Miranda

Thesis type

application/pdf

Language

  • en
