Paper Title
DHEN: A Deep and Hierarchical Ensemble Network for Large-Scale Click-Through Rate Prediction
Authors
Abstract
Learning feature interactions is important to the model performance of online advertising services. As a result, extensive efforts have been devoted to designing effective architectures to learn feature interactions. However, we observe that the practical performance of those designs can vary from dataset to dataset, even when the order of interactions claimed to be captured is the same. That indicates that different designs may have different advantages and that the interactions they capture carry non-overlapping information. Motivated by this observation, we propose DHEN, a deep and hierarchical ensemble architecture that can leverage the strengths of heterogeneous interaction modules and learn a hierarchy of interactions under different orders. To overcome the challenges that DHEN's deeper, multi-layer structure brings to training, we propose a novel co-designed training system that further improves the training efficiency of DHEN. Experiments with DHEN on a large-scale dataset from CTR prediction tasks attained a 0.27% improvement in the Normalized Entropy (NE) of prediction and 1.2x better training throughput than the state-of-the-art baseline, demonstrating their effectiveness in practice.
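
The abstract reports its accuracy gain in Normalized Entropy (NE) without defining the metric. The following is a minimal sketch, assuming the definition commonly used in the CTR prediction literature: the average log loss divided by the entropy of the background (empirical) click-through rate. The function name `normalized_entropy` and the clipping constant `eps` are illustrative choices, not taken from the paper.

```python
import math


def normalized_entropy(labels, preds, eps=1e-12):
    """Average log loss divided by the entropy of the empirical CTR.

    Assumes the common CTR-literature definition of NE. Lower is better;
    a model that always predicts the empirical CTR scores 1.0.
    """
    if len(labels) != len(preds) or not labels:
        raise ValueError("labels and preds must be non-empty and of equal length")
    n = len(labels)

    # Clip predictions away from 0 and 1 so the logarithms stay finite.
    clipped = [min(max(p, eps), 1.0 - eps) for p in preds]

    # Mean binary cross-entropy of the predictions.
    log_loss = -sum(
        y * math.log(p) + (1 - y) * math.log(1.0 - p)
        for y, p in zip(labels, clipped)
    ) / n

    # Entropy of the background CTR (fraction of positive labels).
    ctr = sum(labels) / n
    if ctr <= 0.0 or ctr >= 1.0:
        raise ValueError("NE is undefined when all labels are identical")
    background_entropy = -(
        ctr * math.log(ctr) + (1.0 - ctr) * math.log(1.0 - ctr)
    )

    return log_loss / background_entropy
```

Under this definition, a constant prediction equal to the empirical CTR yields an NE of 1.0, and any model that is more informative than that constant yields a value below 1.0, so an improvement in NE corresponds to a lower value.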