Paper title
Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation
Paper authors
Abstract
Commonly used evaluation measures including Recall, Precision, F-Measure and Rand Accuracy are biased and should not be used without a clear understanding of the biases and a corresponding identification of the chance or base-case levels of the statistic. A system that performs worse in the objective sense of Informedness can appear to perform better under any of these commonly used measures. We discuss several concepts and measures that reflect the probability that a prediction is informed versus chance, in particular Informedness, and introduce Markedness as a dual measure for the probability that a prediction is marked versus chance. Finally, we demonstrate elegant connections between the concepts of Informedness, Markedness, Correlation and Significance, as well as their intuitive relationships with Recall and Precision, and outline the extension from the dichotomous case to the general multi-class case.
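The abstract gives no formulas, so the following is only a minimal sketch of the dichotomous case under assumed definitions: Informedness as Recall + Inverse Recall - 1, Markedness as Precision + Inverse Precision - 1, and Correlation as the Matthews correlation coefficient. The function name `binary_metrics` and the two contingency tables are hypothetical, chosen purely to illustrate how a biased system can score higher on Recall, F-Measure and Rand Accuracy while being less informed.

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """Compute common evaluation measures from a 2x2 contingency table.

    Illustrative helper (not from the paper). Assumes every row and
    column total is non-zero, so no division-by-zero handling is done.
    """
    recall = tp / (tp + fn)             # true positive rate
    inverse_recall = tn / (tn + fp)     # true negative rate
    precision = tp / (tp + fp)          # positive predictive value
    inverse_precision = tn / (tn + fn)  # negative predictive value
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    f1 = 2 * precision * recall / (precision + recall)
    # Assumed dichotomous definitions of the two chance-corrected measures.
    informedness = recall + inverse_recall - 1
    markedness = precision + inverse_precision - 1
    # Matthews correlation; its square equals informedness * markedness.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "recall": recall,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
        "informedness": informedness,
        "markedness": markedness,
        "mcc": mcc,
    }


# Hypothetical data: 90 real positives, 10 real negatives.
# The "biased" system labels almost everything positive.
biased = binary_metrics(tp=88, fn=2, fp=9, tn=1)
# The "informed" system misses more positives but separates the classes.
informed = binary_metrics(tp=72, fn=18, fp=1, tn=9)

for name, m in (("biased", biased), ("informed", informed)):
    print(name, {k: round(v, 3) for k, v in m.items()})
```

On these made-up counts the biased system has the higher Recall, F-Measure and Rand Accuracy, yet its Informedness is close to zero while the informed system's is much higher. The squared correlation equals the product of Informedness and Markedness in both cases, which is one concrete form of the connection the abstract mentions.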