Paper Title

Optimistic Online Convex Optimization in Dynamic Environments

Authors

Qing-xin Meng, Jian-wei Liu

Abstract

In this paper, we study the optimistic online convex optimization problem in dynamic environments. Existing work shows that Ader enjoys an $O\left(\sqrt{\left(1+P_T\right)T}\right)$ dynamic regret upper bound, where $T$ is the number of rounds and $P_T$ is the path length of the reference strategy sequence. However, Ader is not environment-adaptive. Because optimism provides a framework for achieving environment adaptivity, we replace Greedy Projection (GP) and Normalized Exponentiated Subgradient (NES) in Ader with Optimistic-GP and Optimistic-NES respectively, and name the resulting algorithm ONES-OGP. We also extend the doubling trick to the adaptive trick, and introduce three characteristic terms that arise naturally from optimism, namely $M_T$, $\widetilde{M}_T$ and $V_T+1_{L^2\rho\left(\rho+2 P_T\right)\leqslant\varrho^2 V_T}D_T$, to replace the dependence of the dynamic regret upper bound on $T$. We elaborate on ONES-OGP with the adaptive trick and on its subgradient-variation version, both of which are environment-adaptive.
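To illustrate what an optimistic update looks like, the following is a minimal sketch of a generic optimistic projected-gradient step. It is not the paper's exact Optimistic-GP. The function names, the L2-ball feasible set, the fixed step size `eta`, and the choice of the previous gradient as the hint are all assumptions made for this example.

```python
import numpy as np


def project_l2_ball(x, radius):
    """Euclidean projection onto the ball {x : ||x||_2 <= radius}."""
    norm = np.linalg.norm(x)
    if norm <= radius:
        return x
    return x * (radius / norm)


def optimistic_gradient_descent(grad_fns, dim, eta=0.5, radius=1.0):
    """Generic optimistic projected gradient descent (illustrative sketch).

    grad_fns: one callable per round; grad_fns[t](x) returns a (sub)gradient
              of the round-t loss at x.
    The hint for each round is assumed to be the previous round's gradient
    (zero in the first round); this is one common choice, not the only one.
    Returns the array of played points, one row per round.
    """
    y = np.zeros(dim)      # auxiliary iterate, updated with true gradients
    hint = np.zeros(dim)   # prediction of the upcoming gradient
    plays = []
    for grad_fn in grad_fns:
        # Optimistic step: move using the hint before the loss is revealed.
        x = project_l2_ball(y - eta * hint, radius)
        # The loss is revealed; observe its gradient at the played point.
        g = grad_fn(x)
        # Correction step: update the auxiliary iterate with the true gradient.
        y = project_l2_ball(y - eta * g, radius)
        hint = g
        plays.append(x)
    return np.array(plays)
```

When the hints predict the gradients well, the played points track the comparator closely, which is the intuition behind replacing the dependence on $T$ with terms that measure how far the hints are from the observed gradients.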