Paper Title
TEDL: A Two-stage Evidential Deep Learning Method for Classification Uncertainty Quantification
Paper Authors
Abstract
In this paper, we propose TEDL, a two-stage learning approach to quantify uncertainty for deep learning models in classification tasks, inspired by our findings from experimenting with the Evidential Deep Learning (EDL) method, a recently proposed uncertainty quantification approach based on Dempster-Shafer theory. More specifically, we observe that EDL tends to yield inferior AUC compared with models learnt with cross-entropy loss, and that it is highly sensitive during training. Such sensitivity is likely to cause unreliable uncertainty estimation, making EDL risky for practical applications. To mitigate both limitations, we propose a simple yet effective two-stage learning approach based on our analysis of the likely causes of this sensitivity: the first stage trains with cross-entropy loss, and the second stage trains with the EDL loss. We also re-formulate the EDL loss by replacing ReLU with ELU to avoid the Dying ReLU issue. Extensive experiments are carried out on training corpora of varied sizes collected from a large-scale commercial search engine, demonstrating that the proposed two-stage learning framework can increase AUC significantly and greatly improve training robustness.
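To make the ELU re-formulation concrete, the following is a minimal sketch (not the paper's exact implementation) of how EDL typically maps classifier logits to Dirichlet parameters, with ELU substituted for ReLU. Because ELU(x) + 1 is strictly positive, every class keeps a nonzero gradient, which is the stated remedy for the Dying ReLU issue. The function names and the `evidence = ELU(logit) + 1` convention here are illustrative assumptions.

```python
import math

def elu(x, a=1.0):
    # ELU: identity for x > 0, a*(exp(x) - 1) otherwise.
    # Unlike ReLU, its gradient never vanishes for x < 0.
    return x if x > 0 else a * (math.exp(x) - 1.0)

def dirichlet_from_logits(logits):
    """Hypothetical EDL-style mapping from logits to Dirichlet beliefs."""
    # ELU(z) + 1 > 0, so the evidence is strictly positive for all classes,
    # whereas ReLU(z) would zero out (and stop gradients for) negative logits.
    evidence = [elu(z) + 1.0 for z in logits]
    alphas = [e + 1.0 for e in evidence]      # Dirichlet parameters
    strength = sum(alphas)                    # Dirichlet strength S
    probs = [a / strength for a in alphas]    # expected class probabilities
    uncertainty = len(alphas) / strength      # u = K / S, in (0, 1]
    return probs, uncertainty

probs, u = dirichlet_from_logits([2.0, -1.0, 0.5])
```

Under this formulation, a confident prediction concentrates evidence on one class (large S, small u), while near-zero logits for all classes drive u toward 1, giving the uncertainty estimate the abstract refers to.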