Paper Title

Biologically Plausible Learning on Neuromorphic Hardware Architectures

Paper Authors

Christopher Wolters, Brady Taylor, Edward Hanson, Xiaoxuan Yang, Ulf Schlichtmann, Yiran Chen

Paper Abstract

With an ever-growing number of parameters defining increasingly complex networks, Deep Learning has led to several breakthroughs surpassing human performance. As a result, data movement for these millions of model parameters causes a growing imbalance known as the memory wall. Neuromorphic computing is an emerging paradigm that confronts this imbalance by performing computations directly in analog memories. On the software side, the sequential Backpropagation algorithm prevents efficient parallelization and thus fast convergence. A novel method, Direct Feedback Alignment, resolves inherent layer dependencies by passing the error directly from the output to each layer. At the intersection of hardware/software co-design, there is a demand for algorithms that are tolerant of hardware nonidealities. Therefore, this work explores the interrelationship of implementing bio-plausible learning in situ on neuromorphic hardware, emphasizing energy, area, and latency constraints. Using the benchmarking framework DNN+NeuroSim, we investigate the impact of hardware nonidealities and quantization on algorithm performance, as well as how network topologies and algorithm-level design choices scale the latency, energy, and area consumption of a chip. To the best of our knowledge, this work is the first to compare the impact of different learning algorithms on Compute-In-Memory-based hardware and vice versa. The best accuracy results remain Backpropagation-based, notably when facing hardware imperfections. Direct Feedback Alignment, on the other hand, allows for significant speedup through parallelization, reducing training time by a factor approaching N for N-layered networks.
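To make the contrast with Backpropagation concrete, below is a minimal NumPy sketch of one Direct Feedback Alignment update for a small multilayer perceptron. This is an illustration only, not the paper's DNN+NeuroSim implementation: the layer sizes, ReLU activation, squared-error loss, initialization scales, and learning rate are all assumptions. The key point is that each hidden layer's error signal is computed from the output error through a fixed random feedback matrix (B1, B2 here), so the per-layer weight updates have no sequential dependency.

```python
import numpy as np

# Minimal, illustrative sketch of Direct Feedback Alignment (DFA).
# All hyperparameters below are assumptions for illustration.

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def relu_grad(a):
    return (a > 0).astype(a.dtype)

n_in, n_hid, n_out = 784, 256, 10

# Forward weights of a 2-hidden-layer MLP
W1 = rng.normal(0.0, 0.05, (n_in, n_hid))
W2 = rng.normal(0.0, 0.05, (n_hid, n_hid))
W3 = rng.normal(0.0, 0.05, (n_hid, n_out))

# Fixed random feedback matrices: DFA projects the output error straight
# to every hidden layer, replacing the transposed forward weights that
# Backpropagation would propagate layer by layer.
B1 = rng.normal(0.0, 0.05, (n_out, n_hid))
B2 = rng.normal(0.0, 0.05, (n_out, n_hid))

def dfa_step(x, y_target, lr=1e-2):
    """One DFA training step with squared-error loss; updates weights in place."""
    global W1, W2, W3

    # Forward pass
    a1 = x @ W1
    h1 = relu(a1)
    a2 = h1 @ W2
    h2 = relu(a2)
    y = h2 @ W3          # linear readout

    e = y - y_target     # output error

    # Each hidden layer's error signal depends only on e, never on the
    # layer above it, so these computations are mutually independent
    # and could run in parallel.
    d2 = (e @ B2) * relu_grad(a2)
    d1 = (e @ B1) * relu_grad(a1)

    W3 -= lr * (h2.T @ e)
    W2 -= lr * (h1.T @ d2)
    W1 -= lr * (x.T @ d1)

# Dummy mini-batch of 32 samples (illustrative shapes only)
x = rng.normal(size=(32, n_in))
y = np.eye(n_out)[rng.integers(0, n_out, size=32)]
dfa_step(x, y)
```

Because d1 and d2 depend only on the output error e, the backward pass collapses into a single parallel step instead of a layer-by-layer chain, which is the source of the near-N-fold training speedup the abstract reports for N-layered networks.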
