Paper title
Low-complexity acoustic scene classification in DCASE 2022 Challenge
Paper authors
Abstract
This paper presents an analysis of the Low-Complexity Acoustic Scene Classification task in the DCASE 2022 Challenge. The task continued from previous years, but the low-complexity requirements were changed as follows: the maximum number of allowed parameters, including zero-valued ones, was 128 K, with parameters represented in INT8 numerical format, and the maximum number of multiply-accumulate operations at inference time was 30 million. The provided baseline system is a convolutional neural network that employs post-training quantization of parameters, resulting in 46.5 K parameters and 29.23 million multiply-accumulate operations (MMACs). Its performance on the evaluation data is 44.2% accuracy and a log loss of 1.532. In comparison, the top system in the challenge obtained an accuracy of 59.6% and a log loss of 1.091, with 121 K parameters and 28 MMACs. The task received 48 submissions from 19 different teams, most of which outperformed the baseline system.
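As a rough illustration of the two complexity limits stated in the abstract, the following minimal sketch checks a system's parameter count and multiply-accumulate count against them. The function and constant names are hypothetical, and reading "128 K" as 128,000 (rather than 131,072) is an assumption; the challenge's official complexity calculation may differ.

```python
# Minimal sketch of the complexity limits described in the abstract.
# All names are hypothetical; "128 K" is ASSUMED to mean 128,000 parameters.

MAX_PARAMS = 128_000       # maximum parameter count, zero-valued ones included
MAX_MACS = 30_000_000      # maximum multiply-accumulate operations at inference
BITS_PER_PARAM = 8         # parameters are represented in INT8


def model_size_bytes(num_params: int, bits_per_param: int = BITS_PER_PARAM) -> int:
    """Storage needed for the parameters alone, in bytes."""
    return num_params * bits_per_param // 8


def within_limits(num_params: int, num_macs: int) -> bool:
    """Return True if both complexity limits are satisfied."""
    return num_params <= MAX_PARAMS and num_macs <= MAX_MACS


# Figures quoted in the abstract (rounded as reported there).
baseline_ok = within_limits(num_params=46_500, num_macs=29_230_000)
top_system_ok = within_limits(num_params=121_000, num_macs=28_000_000)
print(baseline_ok, top_system_ok)
```

With the rounded figures from the abstract, both the baseline and the top system fall inside the limits, and at INT8 precision the parameter budget corresponds to one byte per parameter.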