Paper Title
Dynamic SDN-based Radio Access Network Slicing with Deep Reinforcement Learning for URLLC and eMBB Services
Authors
Abstract
Radio access network (RAN) slicing is a key technology that enables 5G networks to support the heterogeneous requirements of generic services, namely ultra-reliable low-latency communication (URLLC) and enhanced mobile broadband (eMBB). In this paper, we propose a two-time-scale RAN slicing mechanism to optimize the performance of URLLC and eMBB services. On the large time scale, an SDN controller allocates radio resources to gNodeBs according to the requirements of the eMBB and URLLC services. On the short time scale, each gNodeB allocates its available resources to its end-users and, if needed, requests additional resources from adjacent gNodeBs. We formulate this problem as a non-linear binary program and prove its NP-hardness. Next, for each time scale, we model the problem as a Markov decision process (MDP): the large time scale is modeled as a single-agent MDP, whereas the short time scale is modeled as a multi-agent MDP. We leverage the exponential-weight algorithm for exploration and exploitation (EXP3) to solve the single-agent MDP of the large time scale, and a multi-agent deep Q-learning (DQL) algorithm to solve the multi-agent MDP of the short-time-scale resource allocation. Extensive simulations show that our approach is efficient under different network parameter configurations and outperforms recent benchmark solutions.
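The abstract names EXP3 as the bandit algorithm for the large-time-scale allocation. Below is a minimal sketch of the generic EXP3 procedure (exponential weights with importance-weighted reward estimates), not the paper's implementation: the arm set, reward function, exploration rate `gamma`, and horizon are illustrative assumptions.

```python
import math
import random

def exp3(num_arms, reward_fn, horizon, gamma=0.1):
    """Generic EXP3 sketch: each arm could stand for one candidate
    resource-allocation decision; reward_fn returns a reward in [0, 1]."""
    weights = [1.0] * num_arms
    history = []
    for _ in range(horizon):
        total = sum(weights)
        # Mix the weight-proportional distribution with uniform exploration.
        probs = [(1 - gamma) * w / total + gamma / num_arms for w in weights]
        arm = random.choices(range(num_arms), weights=probs)[0]
        reward = reward_fn(arm)
        # Importance-weighted estimate: only the pulled arm is updated,
        # scaled by 1/p so the estimate is unbiased.
        estimate = reward / probs[arm]
        weights[arm] *= math.exp(gamma * estimate / num_arms)
        history.append((arm, reward))
    return history
```

The uniform-exploration term `gamma / num_arms` keeps every arm's selection probability bounded away from zero, which is what makes the importance-weighted estimates well behaved in adversarial settings.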