Paper Title
Explaining Translationese: why are Neural Classifiers Better and what do they Learn?
Paper Authors
Paper Abstract
Recent work has shown that neural feature- and representation-learning, e.g. BERT, achieves superior performance over traditional manual feature-engineering based approaches, e.g. with SVMs, in translationese classification tasks. Previous research did not show $(i)$ whether the difference is because of the features, the classifiers, or both, and $(ii)$ what the neural classifiers actually learn. To address $(i)$, we carefully design experiments that swap features between BERT- and SVM-based classifiers. We show that an SVM fed with BERT representations performs at the level of the best BERT classifiers, while BERT learning and using handcrafted features performs at the level of an SVM using handcrafted features. This shows that the performance differences are due to the features. To address $(ii)$, we use integrated gradients and find that $(a)$ there is an indication that the information captured by handcrafted features is only a subset of what BERT learns, and $(b)$ part of BERT's top performance is due to BERT learning topic differences and spurious correlations with translationese.
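The first feature-swap direction in the abstract (an SVM fed with BERT representations) can be illustrated with a minimal sketch: a frozen BERT encoder produces sentence vectors that are then passed to a linear SVM. The checkpoint name, the use of the last-layer `[CLS]` vector as the sentence representation, and the toy data are illustrative assumptions, not the authors' exact setup.

```python
# Sketch of the feature-swap idea: frozen BERT representations as
# input features for a linear SVM translationese classifier.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.svm import LinearSVC

# Assumed checkpoint; the paper's exact model may differ.
name = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(name)
bert = AutoModel.from_pretrained(name)
bert.eval()

def embed(sentences):
    """Return the last-layer [CLS] vector for each sentence."""
    enc = tokenizer(sentences, padding=True, truncation=True,
                    return_tensors="pt")
    with torch.no_grad():
        out = bert(**enc)
    return out.last_hidden_state[:, 0, :].numpy()  # [CLS] token

# Toy stand-ins: originals (label 0) vs. translations (label 1).
train_sents = ["An original English sentence.",
               "A sentence translated into English."]
train_labels = [0, 1]

clf = LinearSVC()
clf.fit(embed(train_sents), train_labels)
print(clf.predict(embed(["Another sentence to classify."])))
```

Because the encoder is frozen, any performance difference relative to fine-tuned BERT can be attributed to the features rather than the classifier, which is the point of the swap experiment.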
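Integrated gradients (Sundararajan et al., 2017) attribute a model's prediction to its inputs by integrating gradients along a straight path from a baseline $x'$ to the input $x$: $\mathrm{IG}_i(x) = (x_i - x'_i)\int_0^1 \frac{\partial F(x' + \alpha(x - x'))}{\partial x_i}\,d\alpha$. Below is a minimal sketch of token-level attribution using Captum's LayerIntegratedGradients on a BERT sequence classifier; the checkpoint, the untrained two-class head, the all-`[PAD]` baseline, and the choice of class index 1 as the "translationese" label are assumptions for illustration, not the paper's configuration.

```python
# Sketch: token-level integrated-gradients attribution for a
# BERT-based translationese classifier, via Captum.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from captum.attr import LayerIntegratedGradients

name = "bert-base-multilingual-cased"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)
model.eval()

def forward(input_ids, attention_mask):
    # Score of the assumed "translationese" class (index 1).
    return model(input_ids, attention_mask=attention_mask).logits[:, 1]

text = "A sentence suspected to be translated."
enc = tokenizer(text, return_tensors="pt")
# All-[PAD] baseline, an illustrative choice of reference input.
baseline = torch.full_like(enc["input_ids"], tokenizer.pad_token_id)

lig = LayerIntegratedGradients(forward, model.bert.embeddings)
attributions = lig.attribute(enc["input_ids"],
                             baselines=baseline,
                             additional_forward_args=(enc["attention_mask"],),
                             n_steps=50)
# Sum over the embedding dimension for one score per token.
scores = attributions.sum(dim=-1).squeeze(0)
for tok, s in zip(tokenizer.convert_ids_to_tokens(enc["input_ids"][0]), scores):
    print(f"{tok:>12s} {s.item():+.4f}")
```

Inspecting which tokens receive high attribution is how one can notice, as the abstract reports, that the classifier partly relies on topical and spurious cues rather than translationese signals alone.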