Paper title
A 65nm 8b-Activation 8b-Weight SRAM-Based Charge-Domain Computing-in-Memory Macro Using A Fully-Parallel Analog Adder Network and A Single-ADC Interface
Authors
Abstract
Performing data-intensive tasks on a von Neumann architecture with both high performance and high power efficiency is challenging because of the memory-wall bottleneck. Computing-in-memory (CiM) is a promising mitigation: it performs parallel in-situ multiply-accumulate (MAC) operations within the memory, supported by the peripheral interface and datapath. SRAM-based charge-domain CiM (CD-CiM) has shown potential for improved power efficiency and computing accuracy. However, existing SRAM-based CD-CiM designs face scaling challenges in meeting the throughput requirements of high-performance multi-bit-quantization applications. This paper presents an SRAM-based, high-throughput, ReLU-optimized CD-CiM macro. It completes the MAC and ReLU of two signed 8b vectors in one CiM cycle with only one A/D conversion. With non-linearity compensation for the analog computing and A/D conversion interfaces, this work achieves 51.2 GOPS throughput and 10.3 TOPS/W energy efficiency, while showing 88.6% accuracy on the CIFAR-10 dataset.
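To make the operation named in the abstract concrete, the following is a minimal software reference model of a MAC followed by ReLU on two signed 8b vectors, plus the average power implied by the two headline figures. It is a sketch only: it models the ideal arithmetic, not the charge-domain circuit, the analog adder network, or the ADC. The function names `mac_relu` and `implied_power_mw` are hypothetical, and the power estimate assumes the throughput and efficiency numbers were measured at the same operating point, which the abstract does not state.

```python
INT8_MIN, INT8_MAX = -128, 127


def mac_relu(activations, weights):
    """Ideal dot product of two signed 8b vectors followed by ReLU.

    Illustrative reference model only; not the macro's analog datapath.
    """
    if len(activations) != len(weights):
        raise ValueError("vectors must have the same length")
    for value in list(activations) + list(weights):
        if not INT8_MIN <= value <= INT8_MAX:
            raise ValueError(f"{value} is outside the signed 8b range")
    # Multiply-accumulate: sum of element-wise products.
    acc = sum(a * w for a, w in zip(activations, weights))
    # ReLU: negative sums are clamped to zero.
    return max(acc, 0)


def implied_power_mw(throughput_gops, efficiency_tops_per_w):
    """Power in mW implied by throughput / efficiency.

    GOPS / (TOPS/W) = 1e9 / 1e12 W = 1e-3 W, so the ratio is already in mW.
    Assumes both figures refer to the same operating point.
    """
    return throughput_gops / efficiency_tops_per_w


if __name__ == "__main__":
    print(mac_relu([1, -2, 3], [4, 5, 6]))
    print(mac_relu([127, -128], [-128, 127]))
    print(round(implied_power_mw(51.2, 10.3), 2))
```

Under that same-operating-point assumption, 51.2 GOPS at 10.3 TOPS/W corresponds to roughly 5 mW of power.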