Posted on 2017-11-01 by Alessandro Pappalardo
Quantization has proven to be a powerful technique for reducing the memory footprint of Convolutional Neural Networks at inference time without sacrificing their accuracy. This is especially useful in the context of embedded and mobile devices, where memory resources come at a premium. Efforts have been invested in quantizing both weights and activations down to binary values, since computing convolutions between binary sequences requires only bit-wise operations, but maintaining acceptable levels of accuracy has proven hard. A quantized representation wider than binary is therefore needed, while a reduced computational footprint remains desirable. To this end, researchers have expanded bit-wise kernels to non-binary quantized convolution. We explore a different approach to this problem, based on the application of Number Theoretic Transforms as fast algorithms for computing convolutions in quantized Convolutional Neural Networks.
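
To illustrate the idea referred to above, the following is a minimal sketch (not taken from the thesis) of exact circular convolution via a Number Theoretic Transform, i.e. a DFT carried out in modular integer arithmetic rather than over the complex numbers. The specific parameters are illustrative assumptions: the prime modulus p = 257, the transform length n = 8, and the choice of 4 as a primitive 8th root of unity mod 257.

```python
# Illustrative sketch: circular convolution of quantized integer sequences
# via a Number Theoretic Transform (NTT). Parameters below are assumptions
# chosen for clarity, not values from the thesis.

P = 257      # prime modulus (257 = 2^8 + 1)
N = 8        # transform length, divides P - 1
ROOT = 4     # 4 is a primitive 8th root of unity mod 257, since 4^4 = 256 = -1 (mod 257)

def ntt(a, root, p):
    """Naive O(n^2) forward transform: A[k] = sum_j a[j] * root^(j*k) mod p."""
    n = len(a)
    return [sum(a[j] * pow(root, j * k, p) for j in range(n)) % p for k in range(n)]

def intt(A, root, p):
    """Inverse transform: apply the forward transform with the inverse root,
    then scale by n^{-1} mod p."""
    n = len(A)
    inv_root = pow(root, p - 2, p)   # modular inverse via Fermat's little theorem
    inv_n = pow(n, p - 2, p)
    return [(x * inv_n) % p for x in ntt(A, inv_root, p)]

def circular_conv(x, h, root=ROOT, p=P):
    """Circular convolution via pointwise multiplication in the NTT domain.
    Exact in integer arithmetic: no floating-point rounding is involved."""
    X, H = ntt(x, root, p), ntt(h, root, p)
    Y = [(xk * hk) % p for xk, hk in zip(X, H)]
    return intt(Y, root, p)

if __name__ == "__main__":
    # Small non-negative integers, standing in for quantized activations/weights,
    # zero-padded so the circular convolution equals the linear one.
    x = [1, 2, 3, 0, 0, 0, 0, 0]
    h = [1, 1, 0, 0, 0, 0, 0, 0]
    print(circular_conv(x, h))  # [1, 3, 5, 3, 0, 0, 0, 0]
```

Because all intermediate results stay within modular integer arithmetic, the output is exact as long as the true convolution values remain below the modulus, which is the property that makes such transforms attractive for low-precision, quantized operands.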