dc.description.abstract |
With an ever-increasing demand for faster and more efficient computation in the machine
learning domain, FPGAs emerge as a powerful platform for accelerating a wide range of
applications. This study examines the interplay between FPGA technology and machine
learning algorithms, showing how FPGA-based acceleration can enable real-time inference.
This project presents an implementation of a super-resolution (SR) convolutional neural
network (CNN) accelerator on a low-cost Zynq-7000 series SoC-based FPGA. The trained SR-CNN
model is deployed on a low-cost Xilinx FPGA board (Cora Z7) by describing it in C and
synthesizing it with Vivado high-level synthesis (HLS). Key optimization techniques such as
loop unrolling, direct memory access (DMA), look-up-table (LUT) partitioning, multiplexer
(MUX) incorporation and 8-bit integer (int8) quantization are strategically applied to make
efficient use of the limited on-board resources, yielding an optimized IP core. The system
exhibits deterministic latency, consistently delivering high-resolution images within a fixed
processing time. The inherent parallelism of the FPGA and the application-specific
architecture enable remarkably low processing times for executing our deep neural network,
reconstructing high-quality images from low-resolution inputs. The outcomes demonstrate both
the viability and the effectiveness of implementing complex machine learning models on
cost-effective FPGA platforms.
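To make these optimization techniques concrete, the following is a minimal, illustrative
HLS-style C sketch of an int8 convolution inner loop combining loop unrolling, pipelining
and array partitioning; it is not the project's actual IP core, and the kernel size, tile
size, function name and re-quantization shift are assumed placeholders.

/* Illustrative sketch only: a hypothetical 3x3 int8 convolution tile in
   Vivado HLS C. All names and sizes are placeholders, not the thesis design. */
#include <stdint.h>

#define K 3       /* kernel size (assumed) */
#define TILE 32   /* on-chip tile width (assumed) */

void conv3x3(const int8_t in[TILE + K - 1][TILE + K - 1],
             const int8_t weight[K][K],
             int8_t out[TILE][TILE])
{
/* keep all kernel weights in registers so the unrolled loops can read them in parallel */
#pragma HLS ARRAY_PARTITION variable=weight complete dim=0
    for (int r = 0; r < TILE; r++) {
        for (int c = 0; c < TILE; c++) {
/* pipeline the per-pixel loop to start one output computation every cycle */
#pragma HLS PIPELINE II=1
            int32_t acc = 0;
            for (int i = 0; i < K; i++) {
#pragma HLS UNROLL
                for (int j = 0; j < K; j++) {
#pragma HLS UNROLL
                    /* int8 multiply-accumulate into a wider accumulator */
                    acc += (int32_t)in[r + i][c + j] * weight[i][j];
                }
            }
            /* re-quantize the 32-bit accumulator back to int8; the shift stands in
               for the model's actual scaling factor */
            int32_t q = acc >> 7;
            if (q > 127)  q = 127;
            if (q < -128) q = -128;
            out[r][c] = (int8_t)q;
        }
    }
}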
By establishing a solid platform, this project encourages the broader adoption and creative
implementation of FPGA-accelerated solutions, ultimately advancing the computational
capability available to the field of machine learning. |
en_US |