Paper Title
Improving Dependability of Neuromorphic Computing With Non-Volatile Memory
Paper Authors
Abstract
As process technology continues to scale aggressively, circuit aging in neuromorphic hardware due to negative bias temperature instability (NBTI) and time-dependent dielectric breakdown (TDDB) is becoming a critical reliability issue, and it is expected to proliferate when non-volatile memory (NVM) is used for synaptic storage. This is because an NVM requires high voltage and current to access its synaptic weights, which further accelerates circuit aging in the hardware. Current methods for qualifying reliability are overly conservative: they estimate circuit aging under worst-case operating conditions and therefore constrain performance unnecessarily. This paper proposes RENEU, a reliability-oriented approach to mapping machine learning applications to neuromorphic hardware, with the aim of improving system-wide reliability without compromising key performance metrics such as the execution time of these applications on the hardware. Fundamental to RENEU is a novel formulation of the aging of CMOS-based circuits in neuromorphic hardware that considers different failure mechanisms. Using this formulation, RENEU develops a system-wide reliability model that can be used inside a design-space exploration framework involving the mapping of neurons and synapses to the hardware. To this end, RENEU uses an instance of Particle Swarm Optimization (PSO) to generate mappings that are Pareto-optimal in terms of performance and reliability. We evaluate RENEU using different machine learning applications on state-of-the-art neuromorphic hardware with NVM synapses. Our results demonstrate an average 38% reduction in circuit aging, leading to an average 18% improvement in the lifetime of the hardware compared to current practices. RENEU introduces only a marginal performance overhead of 5% compared to a performance-oriented state-of-the-art approach.
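To make the notion of mappings that are Pareto-optimal in performance and reliability concrete, the following is a minimal sketch of a Pareto filter over candidate mappings. It is not RENEU's implementation and does not include PSO or any aging model. The function names, the two-objective score layout (execution time, aging), and the numeric scores are all hypothetical and chosen purely for illustration.

```python
from typing import List, Tuple

# Assumption: each candidate mapping is summarized by two scores,
# (execution_time, aging), and lower is better for both.
Score = Tuple[float, float]


def dominates(a: Score, b: Score) -> bool:
    """Return True if a is no worse than b in both objectives
    and strictly better in at least one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(scores: List[Score]) -> List[int]:
    """Return the indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical, made-up scores for five candidate mappings.
candidates = [(1.00, 0.90), (1.05, 0.60), (1.20, 0.55), (1.10, 0.70), (1.30, 0.80)]
front = pareto_front(candidates)
```

Here the fourth and fifth candidates are dominated by the second (it is both faster and ages less), so only the first three remain on the front. In a design-space exploration, an optimizer such as PSO would propose the candidates, and a filter of this kind would keep the trade-off set from which a designer picks a mapping.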