Paper Title

Optimal Contextual Pricing and Extensions

Paper Authors

Allen Liu, Renato Paes Leme, Jon Schneider

Paper Abstract

In the contextual pricing problem, a seller repeatedly obtains products described by an adversarially chosen feature vector in $\mathbb{R}^d$ and only observes the purchasing decisions of a buyer with a fixed but unknown linear valuation over the products. The regret measures the difference between the revenue the seller could have obtained knowing the buyer's valuation and the revenue obtained by the learning algorithm. We give a poly-time algorithm for contextual pricing with $O(d \log \log T + d \log d)$ regret, which matches the $\Omega(d \log \log T)$ lower bound up to the $d \log d$ additive factor. If we replace the pricing loss by the symmetric loss, we obtain an algorithm with nearly optimal regret of $O(d \log d)$, matching the $\Omega(d)$ lower bound up to a $\log d$ factor. These algorithms are based on a novel technique of bounding the value of the Steiner polynomial of a convex region at various scales. The Steiner polynomial is a degree-$d$ polynomial with intrinsic volumes as its coefficients. We also study a generalized version of contextual search where the hidden linear function over Euclidean space is replaced by a hidden function $f : \mathcal{X} \rightarrow \mathcal{Y}$ in a certain hypothesis class $\mathcal{H}$. We provide a generic algorithm with $O(d^2)$ regret, where $d$ is the covering dimension of this class. This leads in particular to an $\tilde{O}(s^2)$ regret algorithm for linear contextual search if the linear function is guaranteed to be $s$-sparse. Finally, we also extend our results to the noisy feedback model, where in each round our feedback is flipped with a fixed probability $p < 1/2$.
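For reference, the following display is a minimal sketch of the two loss functions and of Steiner's formula as they are standardly defined in the contextual search literature; the symbols $v$ (the buyer's value for the round's product), $p$ (the posted price), $B^d$ (the unit ball), and $\kappa_j$ (the volume of the $j$-dimensional unit ball) are our notation for illustration, not necessarily the paper's.

% Per-round losses in contextual pricing (standard conventions;
% v = buyer's value, p = posted price are illustrative notation):
$$
\ell_{\mathrm{pricing}}(p, v) =
  \begin{cases}
    v - p & \text{if } p \le v \quad (\text{sale at price } p), \\
    v     & \text{if } p > v  \quad (\text{no sale}),
  \end{cases}
\qquad
\ell_{\mathrm{sym}}(p, v) = |v - p|.
$$

% Steiner's formula: the volume of the Minkowski sum of a convex body
% K in R^d with a ball of radius t is a degree-d polynomial in t whose
% coefficients involve the intrinsic volumes V_j(K):
$$
\mathrm{vol}\bigl(K + t B^d\bigr) = \sum_{j=0}^{d} V_j(K)\, \kappa_{d-j}\, t^{\,d-j}.
$$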
