Paper Title

GateFormer: Speeding Up News Feed Recommendation with Input Gated Transformers

Paper Authors

Zhang, Peitian; Liu, Zheng

Paper Abstract

News feed recommendation is an important web service. In recent years, pre-trained language models (PLMs) have been intensively applied to improve recommendation quality. However, the utilization of these deep models is limited in many aspects, such as the lack of explainability and incompatibility with existing inverted index systems. Above all, PLM-based recommenders are inefficient, as encoding the user-side information incurs huge computation costs. Although the computation can be accelerated with efficient transformers or distilled PLMs, it is still not enough to make timely recommendations for active users, who are associated with super-long news browsing histories. In this work, we tackle the efficient news recommendation problem from a distinctive perspective. Instead of relying on the entire input (i.e., the collection of news articles a user has browsed), we argue that the user's interest can be fully captured with just the representative keywords. Motivated by this, we propose GateFormer, where the input data is gated before being fed into the transformers. The gating module is made personalized, lightweight, and end-to-end learnable, so that it can accurately and efficiently filter the informative parts of the user input. GateFormer achieves highly impressive performance in experiments, notably outperforming existing acceleration approaches in both accuracy and efficiency. We also find, surprisingly, that even with more than 10-fold compression of the original input, GateFormer still maintains on-par performance with the SOTA methods.
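To illustrate the input-gating idea described in the abstract, here is a minimal PyTorch-style sketch. It is not the authors' implementation: the module names, dimensions, and the token-scoring function are assumptions, since the abstract only states that a personalized, lightweight, end-to-end learnable gate selects representative keywords from the browsing history before the transformer encodes it.

```python
# A minimal sketch of the input gating described in the abstract (PyTorch).
# All names, sizes, and the exact scoring function are assumptions: the abstract
# only says that a personalized, lightweight, end-to-end learnable gate selects
# representative keywords from the browsed news before the transformer runs.
import torch
import torch.nn as nn


class GatedUserEncoder(nn.Module):
    def __init__(self, vocab_size: int, dim: int = 128, keep_k: int = 32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)      # shared token embeddings
        self.user_proj = nn.Linear(dim, dim)            # lightweight personalization head
        self.keep_k = keep_k                             # tokens that survive the gate
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, history_tokens: torch.Tensor) -> torch.Tensor:
        # history_tokens: (batch, seq_len) token ids of the concatenated browsed news.
        x = self.embed(history_tokens)                   # (batch, seq_len, dim)

        # Personalized scoring: compare each token with a cheap summary of the
        # user's own history (mean pooling stands in for whatever the paper uses).
        user_vec = self.user_proj(x.mean(dim=1, keepdim=True))   # (batch, 1, dim)
        scores = (x * user_vec).sum(-1)                           # (batch, seq_len)

        # Hard top-k selection: the expensive transformer only sees keep_k tokens,
        # i.e. roughly a 10-fold shorter sequence than the raw history.
        k = min(self.keep_k, scores.size(1))
        top_idx = scores.topk(k, dim=1).indices                   # (batch, k)
        kept = torch.gather(x, 1, top_idx.unsqueeze(-1).expand(-1, -1, x.size(-1)))

        # Soft gate values let gradients reach the scorer even though the
        # index selection itself is non-differentiable.
        gate = torch.sigmoid(torch.gather(scores, 1, top_idx)).unsqueeze(-1)
        return self.transformer(kept * gate).mean(dim=1)          # (batch, dim) user vector


# Hypothetical usage: a batch of 2 users, each with a 320-token browsing history.
encoder = GatedUserEncoder(vocab_size=30522)
user_vectors = encoder(torch.randint(0, 30522, (2, 320)))
print(user_vectors.shape)  # torch.Size([2, 128])
```

The design point the abstract emphasizes is that the expensive transformer only processes the tokens that survive the gate, which is where the claimed 10-fold input compression and the corresponding speed-up would come from.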
