Paper Title
DualDE: Dually Distilling Knowledge Graph Embedding for Faster and Cheaper Reasoning
Paper Authors
Paper Abstract
Knowledge Graph Embedding (KGE) is a popular method for KG reasoning, and KGEs trained with higher dimensions are usually preferred since they have better reasoning capability. However, high-dimensional KGEs pose huge challenges to storage and computing resources and are not suitable for resource-limited or time-constrained applications, for which faster and cheaper reasoning is necessary. To address this problem, we propose DualDE, a knowledge distillation method that builds a low-dimensional student KGE from a pre-trained high-dimensional teacher KGE. DualDE considers the dual influence between the teacher and the student. In DualDE, we propose a soft label evaluation mechanism to adaptively assign different soft-label and hard-label weights to different triples, and a two-stage distillation approach to improve the student's acceptance of the teacher. DualDE is general enough to be applied to various KGEs. Experimental results show that our method can successfully reduce the embedding parameters of a high-dimensional KGE by 7-15 times and increase the inference speed by 2-6 times while retaining high performance. We also experimentally demonstrate the effectiveness of the soft label evaluation mechanism and the two-stage distillation approach via ablation studies.
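The abstract's core idea, a per-triple loss that adaptively mixes a hard-label term with a teacher-derived soft-label term, can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the function name `distill_loss`, the squared-error soft term, and the scalar `quality` weight are all assumptions for the sake of the example.

```python
import math

def distill_loss(student_score, teacher_score, label, quality):
    """Illustrative per-triple distillation loss (hypothetical formulation).

    student_score / teacher_score: scalar plausibility scores for one triple.
    label: 1.0 for a positive triple, 0.0 for a negative one.
    quality: in [0, 1], how trustworthy the teacher's soft label is judged to
             be for this triple; higher quality shifts weight to the soft term.
    """
    # Hard-label term: binary cross-entropy on the student's own score.
    p = 1.0 / (1.0 + math.exp(-student_score))
    hard = -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))
    # Soft-label term: pull the student's score toward the teacher's.
    soft = (student_score - teacher_score) ** 2
    # Adaptive weighting: rely on the teacher more where it is reliable.
    return quality * soft + (1.0 - quality) * hard
```

For example, with `quality = 1.0` the loss reduces to pure score matching against the teacher, while `quality = 0.0` falls back to ordinary supervised training on the hard label; DualDE's evaluation mechanism is described as choosing these weights per triple rather than globally.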