NUST Institutional Repository

Energy and Performance Efficient Inference for Neural Networks

dc.contributor.author Naeem, Mohammad Omer
dc.date.accessioned 2023-08-26T13:18:08Z
dc.date.available 2023-08-26T13:18:08Z
dc.date.issued 2019
dc.identifier.other 117553
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/37588
dc.description Supervisor: Dr. Osman Hasa en_US
dc.description.abstract Model compression is an essential technique for reducing redundancy in deep neural networks (DNNs), enabling their efficient deployment on hardware ranging from GPU clusters in data centers to highly resource-constrained processors in edge devices. Conventionally, compressing a DNN requires creating policies that balance the size, speed, and accuracy of the network for a particular hardware target. A policy is selected after multiple trials and analysis of the domain space, which makes the selection process laborious, time-consuming, and dependent on human expertise. We propose Auto Compress, which mainly uses Bayesian optimization to automatically determine a compression policy based on a combination of pruning and quantization. The strategy for combining structured pruning, unstructured pruning, and quantization is assigned at the beginning, based on the characteristics of the selected hardware. The proposed automatically learned compression policy outperforms hand-crafted policies, providing more compression while preserving accuracy. Applying our single-click compression to Plain20 on CIFAR-10, we achieve a 66.2% reduction in FLOPs with a Top-1 accuracy of 88%, whereas applying our size-focused compression to ResNet-20 achieves an 11.2x reduction in size without any loss of accuracy. en_US
dc.language.iso en en_US
dc.publisher School of Electrical Engineering and Computer Science (SEECS), NUST en_US
dc.title Energy and Performance Efficient Inference for Neural Networks en_US
dc.type Thesis en_US
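
The abstract describes choosing a compression policy that combines pruning and quantization. The following is a minimal illustrative sketch of that idea, not the thesis's implementation. It applies unstructured magnitude pruning and symmetric uniform quantization to a list of weights, then picks the policy with the highest compression ratio under an error budget. All function names, the size model (32-bit dense baseline, no sparse-index overhead), and the use of mean squared error in place of task accuracy are assumptions made for illustration. The exhaustive search is a plain stand-in for the Bayesian optimization named in the abstract.

```python
import random


def magnitude_prune(weights, sparsity):
    """Unstructured pruning: zero the `sparsity` fraction of smallest-magnitude weights."""
    k = int(len(weights) * sparsity)  # number of weights to zero out
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_uniform(weights, bits):
    """Symmetric uniform quantization to signed `bits`-bit levels (bits >= 2), dequantized."""
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    levels = 2 ** (bits - 1) - 1  # e.g. 127 for 8 bits
    scale = max_abs / levels
    return [round(w / scale) * scale for w in weights]


def compression_ratio(n_weights, n_nonzero, bits):
    """Assumed size model: 32-bit dense baseline vs. `bits` per surviving weight.

    Sparse-index storage overhead is ignored in this sketch.
    """
    return (n_weights * 32) / (max(n_nonzero, 1) * bits)


def apply_policy(weights, sparsity, bits):
    """Prune, then quantize; return (compression ratio, MSE against the original)."""
    compressed = quantize_uniform(magnitude_prune(weights, sparsity), bits)
    nonzero = sum(1 for w in compressed if w != 0.0)
    mse = sum((a - b) ** 2 for a, b in zip(weights, compressed)) / len(weights)
    return compression_ratio(len(weights), nonzero, bits), mse


def search_policy(weights, sparsities, bit_widths, max_mse):
    """Exhaustive search over candidate policies (a stand-in, NOT Bayesian optimization).

    Returns (ratio, sparsity, bits, mse) for the best policy within the
    error budget, or None if no candidate satisfies it.
    """
    best = None
    for s in sparsities:
        for b in bit_widths:
            ratio, mse = apply_policy(weights, s, b)
            if mse <= max_mse and (best is None or ratio > best[0]):
                best = (ratio, s, b, mse)
    return best


# Usage: synthetic Gaussian weights stand in for one layer of a trained network.
rng = random.Random(0)
layer = [rng.gauss(0.0, 1.0) for _ in range(1000)]
best = search_policy(layer, [0.0, 0.25, 0.5, 0.75], [4, 8, 16], max_mse=0.01)
print(best)
```

In a realistic setting the error budget would be expressed as validation accuracy after fine-tuning, and each policy evaluation would be expensive. That cost is what motivates a sample-efficient search such as Bayesian optimization over the exhaustive loop used here.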


This item appears in the following Collection(s)

  • MS [882]
