Robust Q-learning

Paper title

Robust Q-learning

Authors

Ertefaie, Ashkan; McKay, James R.; Oslin, David; Strawderman, Robert L.

Abstract

Q-learning is a regression-based approach that is widely used to formalize the development of an optimal dynamic treatment strategy. Finite-dimensional working models are typically used to estimate certain nuisance parameters, and misspecification of these working models can result in residual confounding and/or efficiency loss. We propose a robust Q-learning approach that allows such nuisance parameters to be estimated using data-adaptive techniques. We study the asymptotic behavior of our estimators and provide simulation studies that highlight the need for and usefulness of the proposed method in practice. We use data from the "Extending Treatment Effectiveness of Naltrexone" multi-stage randomized trial to illustrate our proposed methods.
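
For readers unfamiliar with the regression-based formulation mentioned in the abstract, the following is a minimal sketch of standard two-stage Q-learning with linear working models. It is not the robust estimator proposed in the paper. The function names, the -1/+1 treatment coding, the single covariate per stage, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients for y ~ X (X already contains an intercept column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def design(x, a):
    """Hypothetical working model: intercept, covariate, treatment, interaction."""
    return np.column_stack([np.ones_like(x), x, a, a * x])


def two_stage_q_learning(x1, a1, x2, a2, y):
    """Backward induction over two stages; treatments are coded -1/+1 (assumption)."""
    # Stage 2: regress the final outcome on the stage-2 covariate and treatment.
    beta2 = fit_ols(design(x2, a2), y)
    # Pseudo-outcome: fitted outcome under the better of the two stage-2 treatments.
    q2_plus = design(x2, np.ones_like(x2)) @ beta2
    q2_minus = design(x2, -np.ones_like(x2)) @ beta2
    pseudo = np.maximum(q2_plus, q2_minus)
    # Stage 1: regress the pseudo-outcome on the stage-1 covariate and treatment.
    beta1 = fit_ols(design(x1, a1), pseudo)
    return beta1, beta2


def optimal_rule(beta, x):
    """Treatment (+1 or -1) that maximizes the fitted linear Q-function at x."""
    return np.where(beta[2] + beta[3] * x >= 0, 1, -1)


# Illustrative simulated trial with randomized treatments (not real data).
rng = np.random.default_rng(0)
n = 5000
x1 = rng.normal(size=n)
a1 = rng.choice([-1.0, 1.0], size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
a2 = rng.choice([-1.0, 1.0], size=n)
y = x2 + a2 * (0.5 - x2) + rng.normal(size=n)

beta1, beta2 = two_stage_q_learning(x1, a1, x2, a2, y)
stage2_choices = optimal_rule(beta2, np.array([0.0, 2.0]))
```

In this simulation the stage-2 working model matches the data-generating mechanism, so the fitted rule recovers the true one. The abstract's concern is the case where such finite-dimensional working models are misspecified, which this sketch does not address.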
