Abstract:
Deep Neural Networks (DNNs) are, in general, highly compute- and memory-intensive and require significant hardware and energy/power resources to perform inference. Compression of deep neural networks reduces these requirements through pruning (the careful elimination of insignificant network connections) and weight sharing. Numerous works have designed hardware accelerators for compressed deep neural network inference; however, these works provide very limited information for fully reproducing their designs. The main goal of this thesis is to design the control and datapath logic of one of the state-of-the-art accelerators, the Efficient Inference Engine (EIE), such that it can efficiently perform sparse matrix-vector multiplication (SpMV) of a sparse weight matrix and a sparse activation vector while minimizing intermediate stalls. This thesis will also study different parameters of the EIE architecture in order to propose an inference engine that can, in general, offer superior efficiency compared to EIE.