Policy Transforms and Learning Optimal Policies

Paper Title

Policy Transforms and Learning Optimal Policies

Author

Russell, Thomas M.

Abstract

We study the problem of choosing optimal policy rules in uncertain environments using models that may be incomplete and/or partially identified. We consider a policymaker who wishes to choose a policy to maximize a particular counterfactual quantity called a policy transform. We characterize learnability of a set of policy options by the existence of a decision rule that closely approximates the maximin optimal value of the policy transform with high probability. Sufficient conditions are provided for the existence of such a rule. However, learnability of an optimal policy is an ex-ante notion (i.e. before observing a sample), and so ex-post (i.e. after observing a sample) theoretical guarantees for certain policy rules are also provided. Our entire approach is applicable when the distribution of unobservables is not parametrically specified, although we discuss how semiparametric restrictions can be used. Finally, we show possible applications of the procedure to a simultaneous discrete choice example and a program evaluation example.

