Machine Learning Using Approximate Computing

Sabir, Muhammad Dilshad

URI: http://10.250.8.41:8080/xmlui/handle/123456789/34748

Date: 2022

Abstract:

Two-dimensional convolution/correlation is a ubiquitous tool for extracting classification features in a variety of machine learning and image processing applications, such as Correlation Pattern Recognition (CPR), Convolutional Neural Networks (CNNs), and filter processing. However, using these tools in Internet-of-Things (IoT)-based applications faces stringent constraints, such as limited memory capacity, computational resources, and energy. The prime objective of this thesis is to propose a set of algorithms and techniques that reduce the computational workload caused by the excessive number of correlation and convolution operations in CPRs and CNNs, respectively. Achieving this objective requires approximated versions of both CPR filters and CNN models that incur no accuracy degradation. The research, however, focuses on obtaining these approximated versions for future IoT implementation. This dissertation makes the following contributions. For CNNs: (a) to overcome the high computation cost of existing convolution algorithms, a hybrid algorithm is proposed that integrates the unique computational advantages of Winograd and spatial convolution; (b) a Particle of Swarm Convolution Layer Optimization (PSCLO) scaling is proposed that minimizes accuracy loss and maximizes the reduction in computational workload when combining both approximations; (c) the experimental results of symmetry and tile quantization approximation in conjunction with PSCLO are analyzed to find the trade-off between the intensity of approximation and accuracy degradation.
For CPR: (d) a Weight Quantization Retraining (WQR) approach is proposed to retrain the low-precision quantized weights of the CPR filter for the dynamic fixed point (DFP) and power-of-two (Po2) quantization schemes; additionally, the Particle of Swarm Optimization technique is employed to fine-tune performance parameters; (e) log-polar and inverse log-polar transforms are used as pre-processing strategies to support low-precision CPR filter quantization; (f) an analysis is performed to compare the advantages of spatially trained (ST) and frequency-trained (FT) filters, and is further extended within each domain to investigate the comparative benefits of the Po2 and DFP quantization schemes; (g) an overall analysis compares the advantages of the direct, log-polar, inverse log-polar, and WQR approaches, which provides a better perspective.

For CNNs, the proposed techniques and algorithms achieved ∼5.28x fewer multiplication operations without significant accuracy loss on ResNet-18. For LeNet, the reduction is ∼3.87x and ∼3.93x on MNIST and Fashion-MNIST, respectively, while the additive workload reductions for these datasets are ∼2.5x and ∼2.56x. For the CIFAR-10 quick network, the techniques achieve ∼9.28x and ∼8.82x fewer multiplications on the CIFAR-10 and SVHN datasets, with additive workload reductions of ∼1.70x and ∼1.33x, respectively. For CPR filters, the following results are obtained on a common dataset. With the direct quantization approach, a compression ratio of 8 achieved a 4.37x speedup without accuracy loss. A compression ratio of 4 with a log-polar implementation, however, achieved a 1.12x speedup with 16% accuracy loss. Inverse log-polar with a compression ratio of 16 achieved an 8.90x speedup with 6% accuracy loss. These empirical investigations demonstrate the effectiveness of the proposed approximation methods for both CPR and CNN using standard datasets.
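
Illustrative sketch (not taken from the thesis): the abstract names Winograd convolution, power-of-two (Po2) quantization, and dynamic fixed point (DFP) quantization without giving details. The minimal Python example below shows the textbook 1-D Winograd F(2,3) transform, which produces two outputs of a 3-tap filter with 4 data-dependent multiplications instead of 6, alongside simple Po2 and DFP weight quantizers. All function names, the exponent range, and the bit width are assumptions chosen for illustration; the thesis's hybrid algorithm, PSCLO scaling, and WQR retraining are not reproduced here.

```python
import math

def direct_corr_1d(d, g):
    """Two outputs of a 3-tap sliding dot product: 6 multiplications."""
    return [d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
            d[1] * g[0] + d[2] * g[1] + d[3] * g[2]]

def winograd_f23(d, g):
    """Textbook Winograd F(2,3): same outputs, 4 data-dependent multiplications."""
    # Filter transform; in practice this is precomputed once per filter.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]
    # The four element-wise products on the transformed input tile.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3
    # Output transform uses additions only.
    return [m1 + m2 + m3, m2 - m3 - m4]

def quantize_po2(w, min_exp=-7, max_exp=0):
    """Round a weight to the nearest signed power of two (log domain).

    The exponent range is an assumed example, not a value from the thesis.
    """
    if w == 0:
        return 0.0
    exp = round(math.log2(abs(w)))
    exp = max(min_exp, min(max_exp, exp))
    return math.copysign(2.0 ** exp, w)

def quantize_dfp(weights, bits=8):
    """Dynamic fixed point: one shared fractional length per weight group.

    The fractional length is chosen so the largest magnitude fits in
    `bits` bits including the sign bit (an assumed, common formulation).
    """
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return [0.0 for _ in weights]
    int_bits = math.floor(math.log2(max_abs)) + 1  # may be zero or negative
    scale = 2.0 ** (bits - 1 - int_bits)
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return [max(qmin, min(qmax, round(w * scale))) / scale for w in weights]

if __name__ == "__main__":
    d = [1.0, 2.0, 3.0, 4.0]   # input tile
    g = [0.5, -1.0, 0.25]      # 3-tap filter
    print(direct_corr_1d(d, g), winograd_f23(d, g))
    print([quantize_po2(w) for w in g])
    print(quantize_dfp(g, bits=8))
```

A Po2 weight lets each multiplication be replaced by a bit shift in hardware, and a shared DFP fractional length lets a whole group of weights use integer arithmetic, which is why such schemes are attractive for the constrained IoT targets the abstract describes.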