History-Aware Online Cache Placement in Fog-Assisted IoT Systems: An Integration of Learning and Control

Paper Title

History-Aware Online Cache Placement in Fog-Assisted IoT Systems: An Integration of Learning and Control

Authors

Xin Gao, Xi Huang, Yinxu Tang, Ziyu Shao, Yang Yang

Abstract

In Fog-assisted IoT systems, it is a common practice to cache popular content at the network edge to achieve high quality of service. Due to uncertainties in practice such as unknown file popularities, cache placement scheme design is still an open problem with unresolved challenges: 1) how to maintain time-averaged storage costs under budgets, 2) how to incorporate online learning to aid cache placement to minimize performance loss (a.k.a. regret), and 3) how to exploit offline historical information to further reduce regret. In this paper, we formulate the cache placement problem with unknown file popularities as a constrained combinatorial multi-armed bandit (CMAB) problem. To solve the problem, we employ virtual queue techniques to manage time-averaged storage cost constraints, and adopt history-aware bandit learning methods to integrate offline historical information into the online learning procedure to handle the exploration-exploitation tradeoff. With an effective combination of online control and history-aware online learning, we devise a Cache Placement scheme with History-aware Bandit Learning called CPHBL. Our theoretical analysis and simulations show that CPHBL achieves a sublinear time-averaged regret bound. Moreover, the simulation results verify CPHBL's advantage over the deep reinforcement learning based approach.
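
The two ingredients named in the abstract, a virtual queue for the time-averaged storage cost constraint and a bandit index warm-started from offline history, can be illustrated with a minimal sketch. This is not the authors' CPHBL scheme: the Bernoulli hit model, the confidence radius, the weight `V`, the scoring rule `V * ucb - queue * cost`, and every function name below are assumptions made for illustration only.

```python
import math
import random


def ucb_index(mean, count, t_eff):
    """Optimistic popularity estimate (assumed UCB1-style radius, capped at 1)."""
    if count == 0:
        return 1.0  # never observed, online or offline: stay fully optimistic
    return min(1.0, mean + math.sqrt(2.0 * math.log(t_eff + 1) / count))


def choose_files(ucb, costs, queue, V, capacity):
    """Pick up to `capacity` files whose reward/cost score is positive."""
    scores = [V * u - queue * c for u, c in zip(ucb, costs)]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [i for i in ranked[:capacity] if scores[i] > 0]


def update_queue(queue, cost_used, budget):
    """Virtual queue: backlog grows whenever a slot's cost exceeds the budget."""
    return max(queue + cost_used - budget, 0.0)


def simulate(popularity, costs, budget, capacity, history, horizon, V=10.0, seed=0):
    """Run the sketch; `history[i]` holds offline 0/1 hit samples for file i."""
    rng = random.Random(seed)
    n = len(popularity)
    # Warm start: offline samples are counted exactly like online observations.
    counts = [len(h) for h in history]
    sums = [float(sum(h)) for h in history]
    hist_total = sum(counts)
    queue, total_cost, total_hits = 0.0, 0.0, 0
    for t in range(1, horizon + 1):
        ucb = [
            ucb_index(sums[i] / counts[i] if counts[i] else 0.0, counts[i], t + hist_total)
            for i in range(n)
        ]
        chosen = choose_files(ucb, costs, queue, V, capacity)
        for i in chosen:
            hit = 1 if rng.random() < popularity[i] else 0  # assumed Bernoulli hits
            counts[i] += 1
            sums[i] += hit
            total_hits += hit
        used = sum(costs[i] for i in chosen)
        total_cost += used
        queue = update_queue(queue, used, budget)
    return total_hits / horizon, total_cost / horizon, queue


if __name__ == "__main__":
    avg_hits, avg_cost, backlog = simulate(
        popularity=[0.9, 0.6, 0.3, 0.1],
        costs=[1.0, 1.0, 1.0, 1.0],
        budget=1.5,
        capacity=2,
        history=[[1, 1, 0], [1, 0], [], [0]],
        horizon=2000,
    )
    print(avg_hits, avg_cost, backlog)
```

In this sketch the queue backlog acts as a price on storage: a large backlog lowers every file's score, so fewer files are cached until the running cost falls back toward the budget. Because the queue update implies that the backlog is at least the accumulated cost overshoot, the time-averaged cost is at most `budget + backlog / horizon`.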
