Application of the Q-learning algorithm to channel selection in opportunistic spectrum access

  • Abstract: This paper addresses channel selection by a secondary (cognitive) user in "listen-before-transmit" opportunistic spectrum access (OSA), where the secondary user periodically senses the channels licensed to primary users before deciding whether to transmit, and proposes a channel selection algorithm based on Q-learning. Under imperfect sensing, a channel selection model for the secondary user is built and an appropriate reward function is designed, so that the agent learns through repeated interaction with an unknown environment and selects the channel with the largest long-term cumulative reward. During learning, a Boltzmann exploration strategy that draws on the idea of simulated annealing balances channel exploration against exploitation. Simulation results show that, without prior knowledge of the channel environment, the proposed algorithm quickly selects a well-performing channel and effectively improves the secondary user's access throughput and the average system capacity.
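The interplay of Q-learning and Boltzmann exploration described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract gives neither the reward function nor the channel model, so the sketch assumes a single-state learner, perfect sensing, a reward of 1 when the chosen channel is idle and 0 otherwise, and a geometric temperature decay. The names `boltzmann_probs`, `learn_channel`, and all parameter values are hypothetical.

```python
import math
import random


def boltzmann_probs(q_values, temperature):
    """Boltzmann (softmax) selection probabilities over Q-values.

    A high temperature gives near-uniform exploration; a low temperature
    concentrates probability on the channel with the largest Q-value.
    """
    m = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def learn_channel(idle_probs, steps=5000, alpha=0.05, gamma=0.5,
                  t_start=1.0, t_min=0.01, decay=0.999, seed=0):
    """Single-state Q-learning over channels (illustrative assumptions only).

    idle_probs[i] is the assumed probability that channel i is idle in a
    slot; the learner does not see these values, only the rewards.
    """
    rng = random.Random(seed)
    q = [0.0] * len(idle_probs)
    temperature = t_start
    for _ in range(steps):
        # Pick a channel according to the Boltzmann distribution.
        probs = boltzmann_probs(q, temperature)
        channel = rng.choices(range(len(q)), weights=probs)[0]
        # Assumed reward: 1 for an idle channel (successful access), else 0.
        reward = 1.0 if rng.random() < idle_probs[channel] else 0.0
        # Standard Q-learning update, with one state shared by all slots.
        q[channel] += alpha * (reward + gamma * max(q) - q[channel])
        # Annealing step: cool the temperature toward t_min.
        temperature = max(t_min, temperature * decay)
    return q
```

Calling `learn_channel([0.1, 0.4, 0.9])` returns one Q-value per channel, and the index of the largest value is the channel the learner currently prefers. Modelling imperfect sensing would require extending the reward with miss-detection and false-alarm outcomes, which this sketch omits.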

     
