Paper title
Ensure A/B Test Quality at Scale with Automated Randomization Validation and Sample Ratio Mismatch Detection
Authors
Abstract
eBay's experimentation platform runs hundreds of A/B tests on any given day. The platform integrates with the tracking infrastructure and customer experience servers, provides the sampling service for experiments, and is responsible for monitoring the progress of each A/B test. This poses many challenges, especially when experiment quality must be ensured at large scale. We discuss two automated test quality monitoring processes and their methodologies: randomization validation using the population stability index (PSI), and sample ratio mismatch (a.k.a. sample delta) detection using sequential analysis. These automated processes help the experimentation platform run high-quality, trustworthy tests not only effectively at large scale, but also efficiently, by minimizing false positive monitoring alarms sent to experimenters.
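For readers unfamiliar with the two checks named in the abstract, the following is a minimal illustrative sketch, not the paper's implementation. It assumes the textbook definition of PSI over pre-binned counts, and it substitutes a plain fixed-horizon chi-square test for the sample ratio check, whereas the abstract states that the platform uses sequential analysis. The function names `psi` and `srm_p_value`, the `eps` floor, and the 50/50 default split are all assumptions made for illustration.

```python
import math


def psi(expected_counts, actual_counts, eps=1e-6):
    """Population stability index between two binned distributions.

    Textbook form: sum over bins of (q - p) * ln(q / p), where p and q
    are the bin shares of the expected and actual populations.
    """
    if len(expected_counts) != len(actual_counts):
        raise ValueError("both inputs must have the same number of bins")
    e_total = sum(expected_counts)
    a_total = sum(actual_counts)
    total = 0.0
    for e, a in zip(expected_counts, actual_counts):
        # eps (an assumed floor) avoids log(0) and division by zero
        # when a bin is empty.
        p = max(e / e_total, eps)
        q = max(a / a_total, eps)
        total += (q - p) * math.log(q / p)
    return total


def srm_p_value(control_n, treatment_n, expected_control_share=0.5):
    """Fixed-horizon chi-square test (1 degree of freedom) for sample
    ratio mismatch. This is a simplification: it is NOT the sequential
    procedure the abstract refers to.
    """
    n = control_n + treatment_n
    exp_c = n * expected_control_share
    exp_t = n - exp_c
    chi2 = (control_n - exp_c) ** 2 / exp_c + (treatment_n - exp_t) ** 2 / exp_t
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))
```

A PSI near zero suggests that the two groups have similar distributions over the chosen bins, and a very small `srm_p_value` suggests that the observed traffic split deviates from the configured one. Evaluating the fixed-horizon test repeatedly as data accumulates inflates false alarms, which is one motivation for a sequential approach.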