Paper title
HINT3: Raising the bar for Intent Detection in the Wild
Paper authors
Paper abstract
Intent Detection systems in the real world are exposed to the complexities of imbalanced datasets containing varying perceptions of intent, unintended correlations, and domain-specific aberrations. To facilitate benchmarking that reflects near-real-world scenarios, we introduce 3 new datasets created from live chatbots in diverse domains. Unlike most existing datasets, which are crowdsourced, our datasets contain real user queries received by the chatbots and facilitate penalising unwanted correlations grasped during the training process. We evaluate 4 NLU platforms and a BERT-based classifier and find that performance saturates at inadequate levels on the test sets because all systems latch on to unintended patterns in the training data.