通过自适应重新解决的最佳正规化在线分配

论文标题

通过自适应重新解决的最佳正规化在线分配

Optimal Regularized Online Allocation by Adaptive Re-Solving

论文作者

Ma, Wanteng, Cao, Ying, Tsang, Danny H. K., Xia, Dong

论文摘要

本文介绍了一个基于双基的算法框架，用于解决正式化的在线资源分配问题，该问题具有潜在的非符合累积奖励，硬资源限制和不可分割的正常规则。在适应性更新资源约束的策略下，提出的框架仅要求对经验双重问题的解决方案提出一定的准确性，但在本地二阶增长条件下给予了最佳的对数遗憾。令人惊讶的是，对双重目标函数的微妙分析使我们能够消除遗憾中臭名昭著的对数因素。灵活的框架呈现出著名的和计算快速算法，例如双重随机梯度下降。此外，提出了不经常的重新解决方案，该方案大大降低了计算需求，而不会损害最佳的遗憾表现。如果在双重优化过程中没有适应性更新，则建立了最糟糕的平方根遗憾下限，这强调了自适应双重变量更新的关键作用。全面的数值实验证明了所提出的算法框架的优点。

This paper introduces a dual-based algorithm framework for solving the regularized online resource allocation problems, which have potentially non-concave cumulative rewards, hard resource constraints, and a non-separable regularizer. Under a strategy of adaptively updating the resource constraints, the proposed framework only requests approximate solutions to the empirical dual problems up to a certain accuracy and yet delivers an optimal logarithmic regret under a locally second-order growth condition. Surprisingly, a delicate analysis of the dual objective function enables us to eliminate the notorious log-log factor in regret bound. The flexible framework renders renowned and computationally fast algorithms immediately applicable, e.g., dual stochastic gradient descent. Additionally, an infrequent re-solving scheme is proposed, which significantly reduces computational demands without compromising the optimal regret performance. A worst-case square-root regret lower bound is established if the resource constraints are not adaptively updated during dual optimization, which underscores the critical role of adaptive dual variable update. Comprehensive numerical experiments demonstrate the merits of the proposed algorithm framework.

下载PDF全文

下载文献需遵守相关版权规定

论文标题