Paper title
Critical analysis on the reproducibility of visual quality assessment using deep features
Paper authors
Abstract
Data used to train supervised machine learning models are commonly split into independent training, validation, and test sets. This paper illustrates that complex data leakage cases have occurred in the no-reference image and video quality assessment literature. Recently, papers in several journals reported performance results well above the best in the field. However, our analysis shows that information from the test set was inappropriately used in the training process in different ways and that the claimed performance results cannot be achieved. When the data leakage is corrected, the performance of these approaches drops even below the state of the art, by a large margin. Additionally, we investigate end-to-end variations of the discussed approaches, which do not improve upon the originals.
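As a minimal illustration of the independence requirement stated in the abstract, the sketch below splits item indices into disjoint training, validation, and test subsets and checks that none overlap. The helper names `split_indices` and `check_no_overlap`, the split fractions, and the dataset size are all hypothetical and not taken from the paper. The check covers only the simplest form of leakage; the cases the paper analyzes are described as more complex.

```python
import random


def split_indices(n_items, val_frac=0.1, test_frac=0.2, seed=0):
    """Shuffle item indices and cut them into disjoint train/val/test lists.

    Hypothetical helper: the fractions and seed are illustrative defaults.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded, so the split is repeatable
    n_test = int(n_items * test_frac)
    n_val = int(n_items * val_frac)
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test


def check_no_overlap(train, val, test):
    """Return True only if no index appears in more than one subset."""
    train_set, val_set, test_set = set(train), set(val), set(test)
    return (
        train_set.isdisjoint(val_set)
        and train_set.isdisjoint(test_set)
        and val_set.isdisjoint(test_set)
    )


train_idx, val_idx, test_idx = split_indices(100)
assert check_no_overlap(train_idx, val_idx, test_idx)
```

Any step that learns from data should then draw only on `train_idx` (and `val_idx` for model selection), with `test_idx` reserved for the final evaluation.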