Paper Title
Exploiting the Potential of Datasets: A Data-Centric Approach for Model Robustness
Paper Authors
Paper Abstract
Robustness of deep neural networks (DNNs) to malicious perturbations is a hot topic in trustworthy AI. Existing techniques obtain robust models given fixed datasets, either by modifying model structures or by optimizing the inference or training process. While significant improvements have been made, the possibility of constructing a high-quality dataset for model robustness remains unexplored. Following the data-centric AI campaign launched by Andrew Ng, we propose a novel dataset-enhancement algorithm that improves the robustness of many existing DNN models. Our optimized dataset includes transferable adversarial examples and 14 kinds of common corruptions. In the data-centric robust learning competition hosted by Alibaba Group and Tsinghua University, our algorithm placed third out of more than 3000 competitors in the first stage and fourth in the second stage. Our code is available at https://github.com/hncszyq/tianchi_challenge.
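The general idea of enhancing a dataset with corrupted and adversarial copies can be illustrated with a minimal sketch. This is not the authors' algorithm (see the repository linked in the abstract for that). It assumes a toy logistic-regression surrogate model, uses additive Gaussian noise as a stand-in for one common corruption, and uses a single sign-gradient (FGSM-style) step as a stand-in for adversarial example generation. The function names `gaussian_noise`, `fgsm`, and `enhance`, and the values of `eps` and `sigma`, are hypothetical choices for illustration; transferability across models is not demonstrated here.

```python
import math
import random


def gaussian_noise(x, sigma, rng):
    """Illustrative corruption: additive Gaussian noise, clipped to [0, 1]."""
    return [min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in x]


def fgsm(x, y, w, b, eps):
    """One sign-gradient step against a logistic-regression surrogate (assumed model)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    # For cross-entropy loss, the gradient with respect to the input is (p - y) * w.
    grad = [(p - y) * wi for wi in w]
    signs = [(g > 0) - (g < 0) for g in grad]
    # Move each feature by eps in the direction that increases the loss.
    return [min(1.0, max(0.0, xi + eps * s)) for xi, s in zip(x, signs)]


def enhance(dataset, w, b, eps=0.03, sigma=0.08, seed=0):
    """Return the dataset with one corrupted and one adversarial copy per sample."""
    rng = random.Random(seed)
    out = []
    for x, y in dataset:
        out.append((x, y))                              # original sample
        out.append((gaussian_noise(x, sigma, rng), y))  # corrupted copy, same label
        out.append((fgsm(x, y, w, b, eps), y))          # adversarial copy, same label
    return out


# Toy usage: two 3-feature samples with values in [0, 1] and binary labels.
data = [([0.2, 0.8, 0.5], 1), ([0.9, 0.1, 0.4], 0)]
w, b = [1.0, -2.0, 0.5], 0.1  # hypothetical surrogate weights
enhanced = enhance(data, w, b)
print(len(enhanced))
```

Labels are kept unchanged for the perturbed copies, so a model trained on the enhanced set sees harder variants of each sample under its original label.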