Optimistic Online Convex Optimization in Dynamic Environments

Paper Title

Optimistic Online Convex Optimization in Dynamic Environments

Paper Authors

Meng, Qing-xin; Liu, Jian-wei

Abstract

In this paper, we study optimistic online convex optimization in dynamic environments. Existing work has shown that Ader enjoys an $O\left(\sqrt{\left(1+P_T\right)T}\right)$ dynamic regret upper bound, where $T$ is the number of rounds and $P_T$ is the path length of the reference strategy sequence. However, Ader is not environment-adaptive. Because optimism provides a framework for achieving environment adaptivity, we replace Greedy Projection (GP) and Normalized Exponentiated Subgradient (NES) in Ader with Optimistic-GP and Optimistic-NES, respectively, and name the resulting algorithm ONES-OGP. We also extend the doubling trick to the adaptive trick and introduce three characteristic terms that arise naturally from optimism, namely $M_T$, $\widetilde{M}_T$, and $V_T+1_{L^2\rho\left(\rho+2 P_T\right)\leqslant\varrho^2 V_T}D_T$, to replace the dependence of the dynamic regret upper bound on $T$. We elaborate ONES-OGP with the adaptive trick and its subgradient variation version, both of which are environment-adaptive.
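
The abstract names Optimistic-GP without spelling out its update. As a rough illustration only, the sketch below shows one common form of optimistic projected gradient descent: play a point by stepping along a guess (hint) of the next gradient, then update an auxiliary iterate with the gradient actually observed. Everything specific here is an assumption rather than the paper's definition: the Euclidean-ball domain, the fixed step size `eta`, the choice of the previous gradient as the hint, and the names `project_ball` and `optimistic_gp`. It does not include the Optimistic-NES expert weighting or the adaptive trick that ONES-OGP adds on top.

```python
import math

def project_ball(x, radius):
    """Euclidean projection of x onto the origin-centred ball of the given radius."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return list(x)
    return [v * radius / norm for v in x]

def optimistic_gp(grad_fns, dim, radius=1.0, eta=0.1):
    """Two-step optimistic projected gradient update (illustrative sketch).

    grad_fns: one gradient oracle per round (hypothetical interface).
    Returns the list of points played, one per round.
    """
    y = [0.0] * dim      # auxiliary iterate, updated with true gradients
    hint = [0.0] * dim   # optimistic guess of the upcoming gradient
    played = []
    for grad in grad_fns:
        # Play: step from the auxiliary iterate along the hint, then project.
        x = project_ball([yi - eta * hi for yi, hi in zip(y, hint)], radius)
        played.append(x)
        g = grad(x)      # observe the true gradient at the played point
        # Update: step the auxiliary iterate along the true gradient.
        y = project_ball([yi - eta * gi for yi, gi in zip(y, g)], radius)
        hint = g         # assumed hint: reuse the last observed gradient
    return played

# Usage: 200 rounds of the same quadratic loss 0.5 * ||x - target||^2.
target = [0.5, -0.25]
grads = [lambda x: [xi - ci for xi, ci in zip(x, target)]] * 200
played = optimistic_gp(grads, dim=2)
```

When the hint is accurate, the played point has effectively already taken the next gradient step. This is the intuition behind bounds that depend on how well gradients are predicted instead of on the raw number of rounds.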
