Automatic Microprocessor Performance Bug Detection

Paper Title

Automatic Microprocessor Performance Bug Detection

Authors

Erick Carvajal Barboza, Sara Jacob, Mahesh Ketkar, Michael Kishinevsky, Paul Gratz, Jiang Hu

Abstract

Processor design validation and debug is a difficult and complex task, which consumes the lion's share of the design process. Design bugs that affect processor performance rather than its functionality are especially difficult to catch, particularly in new microarchitectures. This is because, unlike functional bugs, the correct processor performance of new microarchitectures on complex, long-running benchmarks is typically not deterministically known. Thus, when benchmarking the performance of new microarchitectures, performance teams may assume that the design is correct when the performance of the new microarchitecture exceeds that of the previous generation, despite significant performance regressions existing in the design. In this work, we present a two-stage, machine learning-based methodology that is able to detect the existence of performance bugs in microprocessors. Our results show that our best technique detects 91.5% of microprocessor core performance bugs whose average IPC impact across the studied applications is greater than 1% versus a bug-free design, with zero false positives. When evaluated on memory system bugs, our technique achieves 100% detection with zero false positives. Moreover, the detection is automatic, requiring very little performance engineer time.
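The abstract does not describe the two stages in detail. As a rough illustration only, the sketch below assumes that a first stage learns to predict IPC from a performance counter on designs known to be bug-free, and that a second stage flags a design when the prediction error is large. The function names, the single-counter linear model, the sample data, and the 5% threshold are all hypothetical and are not taken from the paper.

```python
from statistics import mean

def fit_line(xs, ys):
    """Stage 1 (assumed): least-squares fit ys ~ a * xs + b on bug-free runs."""
    mx, my = mean(xs), mean(ys)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

def mean_relative_error(model, xs, ys):
    """Average relative gap between predicted and measured IPC."""
    a, b = model
    return mean(abs((a * x + b) - y) / y for x, y in zip(xs, ys))

def is_buggy(model, xs, ys, threshold):
    """Stage 2 (assumed): flag a design when the stage-1 error exceeds the threshold."""
    return mean_relative_error(model, xs, ys) > threshold

# Hypothetical bug-free reference data: one counter value per run vs. measured IPC.
ref_counter = [1.0, 2.0, 3.0, 4.0, 5.0]
ref_ipc = [2.0, 1.8, 1.6, 1.4, 1.2]
model = fit_line(ref_counter, ref_ipc)

# Hypothetical designs under test: (counter values, measured IPC).
designs = {
    "bug_free": ([1.5, 2.5, 3.5], [1.9, 1.7, 1.5]),
    "buggy": ([1.5, 2.5, 3.5], [1.6, 1.4, 1.3]),
}
flags = {name: is_buggy(model, xs, ys, threshold=0.05)
         for name, (xs, ys) in designs.items()}
print(flags)
```

In this toy setup, a design whose measured IPC tracks the reference model stays unflagged, while one that falls well below the prediction is reported. A detection rate and a false-positive count like those quoted in the abstract would come from running such a check over many known-buggy and bug-free designs.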

