Efficient Embedding VNFs in 5G Network Slicing: A Deep Reinforcement Learning Approach

Paper Title

Efficient Embedding VNFs in 5G Network Slicing: A Deep Reinforcement Learning Approach

Authors

Le, Linh; Nguyen, Tu N.; Suo, Kun; He, Jing

Abstract

5G radio access network (RAN) slicing aims to logically split an infrastructure into a set of self-contained programmable RAN slices. Each slice, built on top of the underlying physical RAN (substrate), is a separate logical mobile network that delivers a set of services with similar characteristics. Each RAN slice is composed of various virtual network functions (VNFs) distributed geographically across numerous substrate nodes. A key challenge in building robust RAN slicing is, therefore, designing a RAN slicing (RS) configuration scheme that can use information such as resource availability in substrate networks, as well as the interdependent relationships among slices, to map (embed) VNFs onto live substrate nodes. With this motivation, we propose a machine-learning-powered RAN slicing scheme that aims to accommodate the maximum number of slices (each a set of connected VNFs) within a given request set. More specifically, we present a deep reinforcement learning scheme called Deep Allocation Agent (DAA). In short, DAA uses an empirically designed deep neural network that observes the current states of the substrate network and the requested slices in order to schedule the slices, whose VNFs are then mapped to substrate nodes using an optimization algorithm. DAA is trained to maximize the number of accommodated slices in the given set by using an explicitly designed reward function. Our experimental study shows that, on average, DAA is able to maintain a rate of successfully routed slices above 80% in a resource-limited substrate network, and about 60% in extreme conditions, i.e., when the available resources are much less than the demands.
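The abstract says the order in which slices are scheduled affects how many of them can be accommodated. The following minimal Python sketch illustrates that effect only. It is not the authors' DAA: the scheduling order is supplied by hand rather than learned, a simple "most free capacity" heuristic stands in for the unspecified optimization algorithm, links and bandwidth are ignored, and the names `embed_slice`, `acceptance_rate`, and the toy data are hypothetical.

```python
from typing import Dict, List, Optional


def embed_slice(capacity: Dict[str, int], vnf_demands: List[int]) -> Optional[Dict[int, str]]:
    """Place every VNF of one slice, or place none of them (all-or-nothing).

    Assumption: each VNF needs a single scalar resource amount, and the
    heuristic below replaces the paper's optimization algorithm.
    """
    remaining = dict(capacity)  # work on a copy so a rejected slice consumes nothing
    mapping: Dict[int, str] = {}
    for index, demand in enumerate(vnf_demands):
        # Pick the substrate node that currently has the most free capacity.
        node = max(remaining, key=remaining.get)
        if remaining[node] < demand:
            return None  # slice rejected
        remaining[node] -= demand
        mapping[index] = node
    capacity.update(remaining)  # commit the resources only on success
    return mapping


def acceptance_rate(capacity: Dict[str, int], slices: List[List[int]], order: List[int]) -> float:
    """Fraction of requested slices embedded when processed in the given order."""
    free = dict(capacity)
    accepted = sum(embed_slice(free, slices[i]) is not None for i in order)
    return accepted / len(slices)


# Toy request set: one large two-VNF slice and three small single-VNF slices.
substrate = {"a": 4, "b": 4}
requests = [[4, 4], [2], [2], [2]]

print(acceptance_rate(substrate, requests, [0, 1, 2, 3]))  # 0.25: the large slice uses everything
print(acceptance_rate(substrate, requests, [1, 2, 3, 0]))  # 0.75: the small slices go first
```

In this toy setting, the ordering is the only thing that changes between the two runs, which is the decision the abstract assigns to the learned agent.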
