Paper Title
A Simple Hash-Based Early Exiting Approach For Language Understanding and Generation
Paper Authors
Paper Abstract
Early exiting allows instances to exit at different layers according to an estimation of their difficulty. Previous works usually adopt heuristic metrics, such as the entropy of internal outputs, to measure instance difficulty, which suffer from poor generalization and require threshold tuning. In contrast, learning to exit, i.e., learning to predict instance difficulty, is a more appealing approach. Though some effort has been devoted to employing such "learn-to-exit" modules, it remains unknown whether and how well instance difficulty can be learned. In response, we first conduct experiments on the learnability of instance difficulty, which demonstrate that modern neural models perform poorly at predicting instance difficulty. Based on this observation, we propose a simple yet effective Hash-based Early Exiting approach (HashEE) that replaces the learn-to-exit modules with hash functions to assign each token to a fixed exiting layer. Unlike previous methods, HashEE requires neither internal classifiers nor extra parameters, and is therefore more efficient. Experimental results on classification, regression, and generation tasks demonstrate that HashEE can achieve higher performance with fewer FLOPs and less inference time compared with previous state-of-the-art early exiting methods.
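The core idea of the abstract — assigning each token to a fixed exiting layer via a hash function instead of a learned predictor — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the layer count, token IDs, and the modulo hash are all assumptions made for the example.

```python
NUM_LAYERS = 12  # hypothetical model depth, chosen for illustration

def exit_layer(token_id: int, num_layers: int = NUM_LAYERS) -> int:
    """Map a token to a fixed exiting layer with a simple hash.

    Unlike a learn-to-exit module, this assignment is deterministic,
    needs no internal classifiers, and adds no trainable parameters.
    """
    return token_id % num_layers + 1  # layers numbered 1..num_layers

# Each token of an input always exits at the same layer.
token_ids = [101, 2054, 2003, 102]  # hypothetical vocabulary IDs
layers = [exit_layer(t) for t in token_ids]
print(layers)  # -> [6, 3, 12, 7]
```

Because the token-to-layer mapping never changes, no per-instance confidence threshold has to be tuned at inference time, which is the efficiency argument the abstract makes.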