Paper Title

DCFIT: Initial Trigger-Based PFC Deadlock Detection in the Data Plane

Authors

Xinyu Crystal Wu, T. S. Eugene Ng

Abstract

Recent data center applications rely on lossless networks to achieve high network performance. Lossless networks, however, can suffer from in-network deadlocks induced by hop-by-hop flow control protocols like PFC. Once deadlocks occur, large parts of the network could be blocked. Existing solutions mainly center on a deadlock avoidance strategy; unfortunately, they are not foolproof. Thus, deadlock detection is a necessary last resort. In this paper, we propose DCFIT, a new mechanism performed entirely in the data plane to detect and solve deadlocks for arbitrary network topologies and routing protocols. Unique to DCFIT is the use of deadlock initial triggers, which contribute to efficient deadlock detection and deadlock recurrence prevention. Preliminary results indicate that DCFIT can detect deadlocks quickly with minimal overhead and mitigate the recurrence of the same deadlocks effectively. This work does not raise any ethical issues.
