Research on Traffic-Aware Intelligent Slicing Resource Allocation for Radio Access Networks
Abstract: Dynamic network resource allocation is crucial in a radio access network (RAN) for meeting service-level agreement (SLA) requirements while simultaneously satisfying the differentiated quality-of-service needs of different applications. Targeting scenarios with fluctuating traffic and rapidly changing network states, this paper proposes an intelligent bandwidth-allocation strategy that combines time-series prediction with deep reinforcement learning, using a long short-term memory (LSTM) network and a dueling deep Q-network (Dueling DQN) to maximize the spectral efficiency and SLA satisfaction of RAN slices. Using the LSTM network to predict the traffic in each slice effectively decouples the computation cycle of the deep reinforcement learning algorithm from the actual slice-configuration cycle. To reduce the computational complexity of the LSTM while preserving its performance, so that it fits the limited computational resources of a RAN, a randomly connected LSTM (RCLSTM) network with a controlled neuron-connectivity ratio is adopted. In addition, compared with a conventional DQN, the Dueling DQN improves the accuracy of Q-value estimation during slicing-policy learning and thereby speeds up convergence. Simulation results show that, by sensing changes in network performance in advance, the proposed RCLSTM-Dueling DQN scheme effectively mitigates the impact of network-environment fluctuations on radio slicing resource management in dense-traffic scenarios, outperforming the original DQN, advantage actor-critic (A2C), and hard-slicing approaches. In RAN slicing scenarios with three different traffic-fluctuation patterns and quality-of-service requirements, it achieves clear gains in convergence speed, spectral efficiency, and slice SLA satisfaction rate. Moreover, an RCLSTM network with a 10% connectivity ratio reduces the computation time of the original LSTM by approximately 11% with negligible performance loss.
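To make the RCLSTM sparsification concrete, the following is a minimal sketch of a randomly connected LSTM cell in which only a fraction of the input-to-hidden and hidden-to-hidden connections is kept, according to fixed random binary masks. The mask placement, the per-matrix granularity, and the PyTorch framing are illustrative assumptions following the RCLSTM idea described in the abstract, not the paper's exact construction.

```python
import torch
import torch.nn as nn

class RCLSTMCell(nn.Module):
    """LSTM cell with randomly, sparsely connected weight matrices (sketch).

    `connectivity` is the fraction of weights kept (e.g. 0.1 for the 10%
    configuration mentioned in the abstract). The binary masks are drawn
    once and frozen, so pruned connections never reappear.
    """

    def __init__(self, input_dim: int, hidden_dim: int, connectivity: float = 0.1):
        super().__init__()
        self.cell = nn.LSTMCell(input_dim, hidden_dim)
        # One fixed random mask per weight matrix (the 4 gates are stacked row-wise).
        self.register_buffer(
            "mask_ih", (torch.rand_like(self.cell.weight_ih) < connectivity).float()
        )
        self.register_buffer(
            "mask_hh", (torch.rand_like(self.cell.weight_hh) < connectivity).float()
        )

    def forward(self, x, state=None):
        # Re-apply the masks before each step so gradient updates
        # cannot revive pruned connections.
        with torch.no_grad():
            self.cell.weight_ih.mul_(self.mask_ih)
            self.cell.weight_hh.mul_(self.mask_hh)
        return self.cell(x, state)  # returns (h, c)
```

Note that a dense mask only emulates the connectivity pattern; the roughly 11% runtime saving reported in the abstract would come from actually skipping the pruned connections (for example, via sparse matrix kernels) in a real implementation.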
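The convergence claim rests on the standard dueling decomposition Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a'). Below is a minimal sketch of such a head; the layer sizes, the single shared feature layer, and the PyTorch framing are assumptions for illustration, not the paper's architecture. In the slicing setting, the state would encode recent (or RCLSTM-predicted) per-slice traffic, and each discrete action a candidate bandwidth partition across slices.

```python
import torch
import torch.nn as nn

class DuelingQNetwork(nn.Module):
    """Dueling Q-network head: Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a').

    State/action sizes and hidden width are illustrative placeholders,
    not values from the paper.
    """

    def __init__(self, state_dim: int, num_actions: int, hidden: int = 128):
        super().__init__()
        self.feature = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)                # state-value stream V(s)
        self.advantage = nn.Linear(hidden, num_actions)  # advantage stream A(s, a)

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        h = self.feature(state)
        v = self.value(h)      # shape (batch, 1)
        a = self.advantage(h)  # shape (batch, num_actions)
        # Subtracting the mean advantage keeps the V/A split identifiable.
        return v + a - a.mean(dim=1, keepdim=True)
```

The mean-subtraction makes the value/advantage split identifiable, which is the usual explanation for the sharper Q-value estimates and faster convergence that the abstract attributes to the Dueling DQN relative to a plain DQN.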