论文标题

双Q学习的均值误差

The Mean-Squared Error of Double Q-Learning

论文作者

Weng, Wentao, Gupta, Harsh, He, Niao, Ying, Lei, Srikant, R.

论文摘要

在本文中,我们建立了双Q学习和Q学习的渐近于点误差之间的理论比较。我们的结果基于基于Lyapunov方程的线性随机近似的分析,并适用于表格设置和线性函数近似,但前提是最佳策略是唯一的,并且算法收敛。我们表明,如果双Q学习使用Q学习率的两倍,并输出了两个估计量的平均值,则双Q学习的渐近于点误差完全等于Q学习的误差。我们还使用仿真提出了这种理论观察的一些实际含义。

In this paper, we establish a theoretical comparison between the asymptotic mean-squared error of Double Q-learning and Q-learning. Our result builds upon an analysis for linear stochastic approximation based on Lyapunov equations and applies to both tabular setting and with linear function approximation, provided that the optimal policy is unique and the algorithms converge. We show that the asymptotic mean-squared error of Double Q-learning is exactly equal to that of Q-learning if Double Q-learning uses twice the learning rate of Q-learning and outputs the average of its two estimators. We also present some practical implications of this theoretical observation using simulations.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源