Accelerating Deep Neural Networks Using FPGA

Esraa Adel

2018, 2018 30th International Conference on Microelectronics (ICM)

Abstract

Deep Convolutional Neural Networks (CNNs) are the state-of-the-art systems for image classification and scene understanding. They are widely used for their superior accuracy, but at the cost of high computational complexity. The current target in this field is to accelerate CNNs so that they can be used in real-time applications. A common solution is to use Graphics Processing Units (GPUs), but their high power consumption prevents their use in everyday equipment. The Field Programmable Gate Array (FPGA) is a newer solution for CNN implementations because of its low power consumption and flexible architecture. This work discusses this problem and provides a solution that compromises between the speed of the CNN and the power consumption of the FPGA. The solution relies on two main speed-up techniques: parallelism of layer resources and pipelining inside some layers.

Figures (5)

Fig. 2. Alex-Net neural network architecture.

Each PE is responsible for the operations of one weight filter, as discussed before. This parallel structure speeds up Conv1 execution by a factor of 96 (the number of filters). The results section explains this further and shows the time reduction that results from this parallelism.
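
The factor of 96 can be illustrated with an idealized cycle-count model. This is only a sketch, not the authors' implementation: the function name `conv_layer_cycles`, the one-MAC-per-cycle PE, and the Conv1 shape (96 filters of 11x11x3 producing a 55x55 output, as in the standard Alex-Net) are assumptions that are not stated in the excerpt above.

```python
import math

def conv_layer_cycles(num_filters, out_h, out_w, macs_per_output, num_pes):
    """Idealized cycle count for one convolution layer.

    Assumes each PE processes one filter at a time and performs
    one multiply-accumulate (MAC) per cycle; memory stalls are ignored.
    """
    cycles_per_filter = out_h * out_w * macs_per_output
    passes = math.ceil(num_filters / num_pes)  # filters are handled in batches of num_pes
    return passes * cycles_per_filter

# Assumed Conv1 shape (standard Alex-Net, not given in the text above).
NUM_FILTERS = 96
MACS_PER_OUTPUT = 11 * 11 * 3

sequential = conv_layer_cycles(NUM_FILTERS, 55, 55, MACS_PER_OUTPUT, num_pes=1)
parallel = conv_layer_cycles(NUM_FILTERS, 55, 55, MACS_PER_OUTPUT, num_pes=96)
speedup = sequential / parallel
print(f"speedup with 96 PEs: {speedup:.0f}x")
```

Under these assumptions the speedup equals the number of filters only when there is one PE per filter; with fewer PEs the filters are processed in several passes and the gain shrinks accordingly.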

Conv1 output are completed, which can produce the first row of Pool1 output. The second row of Pool1 can be executed after the fourth and fifth rows of Conv1 are completed. Fig. 4 shows the pipelining flow in time. After Pool1 generates a complete output row, Conv1 can overwrite its output, which reduces the memory needed for the convolution output to only four rows.
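
The row-level pipelining described above can be sketched as a small simulation. It is a simplified model under stated assumptions, not the paper's hardware design: the pooling window of 3 rows with stride 2 and the 55-row Conv1 output are taken from the standard Alex-Net rather than from the text, pooling is treated as instantaneous, and the name `simulate_row_pipeline` is hypothetical.

```python
def simulate_row_pipeline(conv_rows, window=3, stride=2, buffer_rows=4):
    """Simulate Conv1 rows streaming into a small circular row buffer.

    Returns a list of (conv_rows_completed, pool_row) events, both 1-based.
    Raises RuntimeError if a conv row that pooling still needs is overwritten.
    """
    buffer = [None] * buffer_rows  # which conv row each buffer slot currently holds
    events = []
    next_pool = 0                  # 0-based index of the next pooling row to emit
    for row in range(conv_rows):
        slot = row % buffer_rows
        old = buffer[slot]
        # A stored row is still live if the next pooling window starts at or before it.
        if old is not None and old >= next_pool * stride:
            raise RuntimeError(f"conv row {row} would overwrite live row {old}")
        buffer[slot] = row
        # Emit a pooling row as soon as the last conv row of its window is ready.
        if row == next_pool * stride + window - 1:
            events.append((row + 1, next_pool + 1))
            next_pool += 1
    return events

events = simulate_row_pipeline(conv_rows=55)
for done, pool_row in events[:3]:
    print(f"after conv row {done}: pool row {pool_row} can be produced")
```

In this model the second pooling row becomes available once the fourth and fifth convolution rows are done, matching the text, and a four-row buffer is never asked to overwrite a row that pooling still needs.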

Simulation time of Alex-Net on different GPUs.

The next step is to compromise between the time required for image prediction and the number of resources. In addition, techniques such as pruning and partial dynamic reconfiguration (PDR) can be used to reduce the power consumption.

