dc.description.abstract |
The Fast Fourier Transform (FFT) has been used in machine learning accelerators for efficient implementation of convolution. In this paper, we introduce a novel architecture, the Fast Fourier Transform Based In-Memory Convolution (FTCiM), which leverages Static Random-Access Memory (SRAM) to execute Deep Neural Networks (DNNs) efficiently. FTCiM implements Fourier transform convolution directly within SRAM and handles the convolutional, fully connected, and pooling layers of neural networks entirely in memory.
We present a comprehensive comparison between the FTCiM architecture and both the previous state-of-the-art SRAM-based conventional convolution implementation and baseline implementations on Central Processing Units (CPUs) and Graphics Processing Units (GPUs). Our experiments show that FTCiM delivers substantial performance improvements across multiple key metrics when evaluated on the Inception V3 model. It outperforms Neural Cache [1], a prominent alternative, reducing inference latency by 19% and increasing inference throughput by 18%, which underscores its capacity to accelerate the processing of complex DNNs. It also achieves a 26% reduction in power consumption compared to Neural Cache. |
en_US |