SATO
Proceedings of the 59th ACM/IEEE Design Automation Conference
https://doi.org/10.1145/3489517.3530592

Abstract
Event-driven spiking neural networks (SNNs) have shown great promise of being strikingly energy-efficient. SNN neurons integrate input spikes, accumulate the membrane potential, and fire an output spike when the potential exceeds a threshold. Existing SNN accelerators, however, have to carry out this accumulation-comparison operation serially. Repetitive spike generation at each time step not only increases latency and the overall energy budget, but also incurs the memory access overhead of fetching membrane potentials, both of which lessen the efficiency of SNN accelerators. Meanwhile, the inherently highly sparse spikes of SNNs lead to imbalanced workloads among neurons that hinder the utilization of processing elements (PEs). This paper proposes SATO, a temporal-parallel SNN accelerator that accumulates the membrane potential for all time steps in parallel. The SATO architecture contains a novel binary adder-search tree to generate the output spike train, which decouples the chronological dependence in the accumulation-comparison operation. Moreover, based on a bucket-sort-based method, SATO can evenly dispatch the compressed workloads to all PEs while maximizing the data locality of input spike trains. Our evaluations show that SATO outperforms the 8-bit version of the ANN accelerator "Eyeriss" by 30.9× in speedup and 12.3× in energy saving. Compared with the state-of-the-art SNN accelerator "SpinalFlow", SATO achieves a 6.4× performance gain and a 4.8× energy reduction for inference.

1 INTRODUCTION

Biologically-inspired spiking neural networks (SNNs) use event-based models to simulate biological neurons and provide high prediction accuracy with minimal energy consumption [1, 5, 6, 16]. Spiking neurons are the main computing and storage units in SNNs; like their biological counterparts, they collect input spikes and emit output spikes according to the membrane potential. Series of spikes, called spike trains, transmit information between neurons through their firing times and firing frequencies. In SNN implementations, the time window of a spike train is divided into time steps to support neuron computation in a synchronized manner [16].
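To make the serial bottleneck concrete, the sketch below implements the accumulation-comparison loop of an integrate-and-fire neuron as described above. The function name, the pre-summed weighted-input representation, and the reset-to-zero behavior after firing are illustrative assumptions, not details taken from the paper; the point is that step t+1 cannot start before step t finishes, which is exactly the chronological dependence SATO removes.

```python
import numpy as np

def serial_if_neuron(weighted_inputs, threshold):
    """Serial accumulation-comparison over T time steps (conventional dataflow).

    weighted_inputs[t] is assumed to be the pre-summed synaptic input arriving
    at time step t; returns a binary output spike train of length T.
    """
    potential = 0.0
    spikes = np.zeros(len(weighted_inputs), dtype=np.uint8)
    for t, x in enumerate(weighted_inputs):  # each iteration depends on the last
        potential += x                        # accumulate membrane potential
        if potential >= threshold:            # compare against the threshold
            spikes[t] = 1                     # fire an output spike
            potential = 0.0                   # reset (one common convention)
    return spikes
```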
FAQs
What improvements does SATO offer over existing SNN accelerators in terms of speed?
The paper reports an average speedup of 30.9× over the 8-bit Eyeriss ANN baseline and 6.4× over the SpinalFlow SNN accelerator.
How does SATO address energy efficiency compared to other SNN accelerators?
SATO consumes 91.3% less energy than Eyeriss and 69.7% less than SpinalFlow, attributed to its temporal-parallel accumulation, which avoids repeatedly fetching membrane potentials, and to its balanced workload dispatch.
What role does the bucket-sorting method play in SATO's performance improvements?
The bucket-sorting method effectively balances workloads across processing elements, maximizing data locality and reducing processing delays, which enhances overall system efficiency.
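The exact dispatch policy is part of SATO's architecture and is not spelled out on this page. The sketch below only illustrates the general bucket-sort idea under stated assumptions: workloads are bucketed by the spike count of their input spike trains, then dealt out round-robin, heaviest bucket first, so each PE receives a similar total number of spikes. All names here (bucket_sort_dispatch, spike_counts, num_pes) are hypothetical.

```python
from collections import defaultdict

def bucket_sort_dispatch(spike_counts, num_pes):
    """Bucket workloads by spike count, then deal them round-robin across PEs.

    spike_counts[i] is the number of spikes in workload i's input spike train;
    returns a list of workload-id lists, one per PE.
    """
    buckets = defaultdict(list)
    for workload_id, count in enumerate(spike_counts):
        buckets[count].append(workload_id)       # bucket-sort key: spike count

    assignment = [[] for _ in range(num_pes)]
    pe = 0
    for count in sorted(buckets, reverse=True):  # heaviest buckets first
        for workload_id in buckets[count]:
            assignment[pe].append(workload_id)   # round-robin keeps totals even
            pe = (pe + 1) % num_pes
    return assignment
```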
How does SATO's architecture enhance temporal parallelism in SNN computations?
SATO's architecture allows simultaneous integration of spikes across time steps, improving computational scalability and enabling simpler processing elements compared to conventional designs.
What are the major architectural innovations introduced in SATO?
SATO integrates a temporal-parallel dataflow and a novel binary adder-search tree for efficient spike generation, decoupling the chronological dependence between time steps in neuron computations.
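The binary adder-search tree itself is a hardware structure described in the paper. As a software analogue of the temporal parallelism described in the two answers above, one can view its effect as computing all T partial sums of the membrane potential at once (an adder-tree-friendly prefix sum) and then searching that sequence for the threshold crossing. The sketch below is that functional analogue, not the paper's circuit; the single-output-spike simplification is an assumption made here.

```python
import numpy as np

def parallel_first_spike(weighted_inputs, threshold):
    """Temporal-parallel analogue: form all T membrane potentials at once,
    then search for the first time step whose potential reaches the threshold.

    Returns the firing time step, or None if the neuron never fires. When the
    per-step inputs are non-negative the prefix sum is monotone, so the search
    step could be a binary search, mirroring the adder-*search*-tree idea.
    """
    potentials = np.cumsum(weighted_inputs)   # all time steps computed together
    crossed = potentials >= threshold
    if not crossed.any():
        return None
    return int(np.argmax(crossed))            # index of the first crossing
```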
References

[1] Filipp Akopyan et al. 2015. TrueNorth: Design and tool flow of a 65 mW 1 million neuron programmable neurosynaptic chip. IEEE TCAD (2015).
[2] Ben Varkey Benjamin et al. 2014. Neurogrid: A mixed-analog-digital multichip system for large-scale neural simulations. Proc. of the IEEE (2014).
[3] Yu-Hsin Chen et al. 2016. Eyeriss: A spatial architecture for energy-efficient dataflow for CNNs. ACM SIGARCH Computer Architecture News (2016).
[4] Iulia M. Comsa et al. 2020. Temporal coding in spiking neural networks with alpha synaptic function. In ICASSP. IEEE.
[5] Mike Davies et al. 2018. Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro (2018).
[6] Lei Deng et al. 2020. Rethinking the performance comparison between SNNs and ANNs. Neural Networks (2020).
[7] Wei Fang et al. 2021. Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In ICCV.
[8] Bing Han and Kaushik Roy. 2021. Deep Spiking Neural Network: Energy Efficiency Through Time-based Coding. In ECCV. 388-404.
[9] Alireza Khodamoradi, Kristof Denolf, and Ryan Kastner. 2021. S2N2: An FPGA Accelerator for Streaming Spiking Neural Networks. In FPGA. 194-205.
[10] Yang Li et al. 2021. BSNN: Towards Faster and Better Conversion of ANNs to SNNs with Bistable Neurons. arXiv (2021).
[11] Fangxin Liu et al. 2020. SSTDP: Supervised Spike Timing Dependent Plasticity for Efficient Spiking Neural Network Training. Frontiers in Neuroscience (2020).
[12] Maryam Mirsadeghi et al. 2021. STiDi-BP: Spike time displacement based error backpropagation in multilayer SNNs. Neurocomputing (2021).
[13] Naveen Muralimanohar et al. 2009. CACTI 6.0: A tool to understand large caches. University of Utah and Hewlett Packard Laboratories, Tech. Rep. (2009).
[14] Surya Narayanan et al. 2020. SpinalFlow: An architecture and dataflow tailored for spiking neural networks. In ISCA. IEEE.
[15] Seongsik Park and Sungroh Yoon. 2021. Training Energy-Efficient Deep Spiking Neural Networks with Time-to-First-Spike Coding. arXiv (2021).
[16] Kaushik Roy et al. 2019. Towards spike-based machine intelligence with neuromorphic computing. Nature (2019).
[17] Abhronil Sengupta et al. 2019. Going deeper in spiking neural networks: VGG and residual architectures. Frontiers in Neuroscience (2019).
[18] Sonali Singh et al. 2020. NEBULA: A neuromorphic spin-based ultra-low power architecture for SNNs and ANNs. In ISCA. IEEE.
[19] Synopsys. [Online]. https://www.synopsys.com/community/university-program/teaching-resources.html.
[20] Amirhossein Tavanaei and Anthony Maida. 2019. BP-STDP: Approximating backpropagation using spike timing dependent plasticity. Neurocomputing (2019).
[21] Yaman Umuroglu et al. 2018. BISMO: A scalable bit-serial matrix multiplication overlay for reconfigurable computing. In FPL. IEEE.