
Fast Algorithm For Quantized Convolutional Neural Networks

Thesis
Posted on 2017-11-01, authored by Alessandro Pappalardo
Quantization has proven to be a powerful technique for reducing the memory footprint of Convolutional Neural Networks at inference time without sacrificing their accuracy. This is especially useful in the context of embedded and mobile devices, where memory resources come at a price. Effort has been invested in quantizing both weights and activations to binary values, since computing the convolution of binary sequences requires only bit-wise operations, but maintaining acceptable levels of accuracy has proven hard. A quantized representation wider than binary is needed, while a reduced computational footprint remains desirable. To this end, researchers have extended bit-wise kernels to non-binary quantized convolution. We explore a different approach to this problem, based on the application of Number Theoretic Transforms as fast algorithms for the computation of convolution in quantized Convolutional Neural Networks.
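
As a minimal sketch of the underlying principle, not of the thesis's own implementation, the Python snippet below computes an exact linear convolution of integer sequences through a Number Theoretic Transform over Z/pZ. The prime p = 998244353 (119 * 2^23 + 1), the primitive root g = 3, and the function names ntt and convolve are standard illustrative choices assumed here, not taken from the thesis.

    # Illustrative sketch: exact integer convolution via a Number Theoretic
    # Transform (NTT). P and G are standard NTT-friendly parameters, assumed
    # for illustration; the thesis's actual construction may differ.
    P = 998244353  # NTT-friendly prime: 119 * 2**23 + 1
    G = 3          # primitive root modulo P

    def ntt(a, invert=False):
        """In-place iterative Cooley-Tukey NTT over Z/PZ; len(a) must be a power of two."""
        n = len(a)
        # Bit-reversal permutation.
        j = 0
        for i in range(1, n):
            bit = n >> 1
            while j & bit:
                j ^= bit
                bit >>= 1
            j |= bit
            if i < j:
                a[i], a[j] = a[j], a[i]
        length = 2
        while length <= n:
            # A primitive length-th root of unity (its inverse for the inverse transform).
            w_len = pow(G, (P - 1) // length, P)
            if invert:
                w_len = pow(w_len, P - 2, P)
            for start in range(0, n, length):
                w = 1
                for k in range(start, start + length // 2):
                    u, v = a[k], a[k + length // 2] * w % P
                    a[k] = (u + v) % P
                    a[k + length // 2] = (u - v) % P
                    w = w * w_len % P
            length <<= 1
        if invert:
            n_inv = pow(n, P - 2, P)  # divide by n via Fermat's little theorem
            for i in range(n):
                a[i] = a[i] * n_inv % P

    def convolve(x, h):
        """Linear convolution of integer sequences x and h, exact modulo P."""
        n = 1
        while n < len(x) + len(h) - 1:
            n <<= 1
        fx = [v % P for v in x] + [0] * (n - len(x))
        fh = [v % P for v in h] + [0] * (n - len(h))
        ntt(fx)
        ntt(fh)
        prod = [a * b % P for a, b in zip(fx, fh)]
        ntt(prod, invert=True)
        return prod[: len(x) + len(h) - 1]

    # Small example with quantized-style integer inputs.
    print(convolve([1, 2, 3], [4, 5, 6]))  # [4, 13, 28, 27, 18]

Because all arithmetic here is exact modular integer arithmetic, the result matches direct convolution whenever the true coefficients stay below p, which is what makes this style of transform a natural fit for low-bit-width quantized weights and activations.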

History

Advisor

Gmytrasiewicz, Piotr J.

Chair

Gmytrasiewicz, Piotr J.

Department

Computer Science

Degree Grantor

University of Illinois at Chicago

Degree Level

  • Masters

Committee Member

Rao, Wenjing
Santambrogio, Marco D.

Submitted Date

August 2017

Issue Date

2017-08-07
