Low-Complexity Acoustic Scene Classification in the DCASE 2022 Challenge

Paper title

Low-complexity acoustic scene classification in DCASE 2022 Challenge

Authors

Martín-Morató, Irene, Paissan, Francesco, Ancilotto, Alberto, Heittola, Toni, Mesaros, Annamaria, Farella, Elisabetta, Brutti, Alessio, Virtanen, Tuomas

Abstract

This paper presents an analysis of the Low-Complexity Acoustic Scene Classification task in the DCASE 2022 Challenge. The task continued from previous years, but the low-complexity requirements changed: the maximum number of allowed parameters, including zero-valued ones, was 128 K, with parameters represented in INT8 numerical format, and the maximum number of multiply-accumulate operations at inference time was 30 million. The provided baseline system is a convolutional neural network that employs post-training quantization of parameters, resulting in 46.5 K parameters and 29.23 million multiply-accumulate operations (MMACs). Its performance on the evaluation data is 44.2% accuracy and a log loss of 1.532. In comparison, the top system in the challenge obtained an accuracy of 59.6% and a log loss of 1.091, with 121 K parameters and 28 MMACs. The task received 48 submissions from 19 different teams, most of which outperformed the baseline system.
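The two complexity limits in the abstract can be checked with simple arithmetic. The sketch below is illustrative only and is not the challenge's official counting tool: the helper names (`conv2d_cost`, `dense_cost`, `within_limits`) and the toy layer shapes are hypothetical, and reading "128 K" as 128,000 is an assumption.

```python
# Limits quoted in the abstract. Reading "128 K" as 128,000 is an assumption.
MAX_PARAMS = 128_000       # parameters, zero-valued ones included
MAX_MACS = 30_000_000      # multiply-accumulate operations per inference


def conv2d_cost(in_ch, out_ch, kernel, out_h, out_w, bias=True):
    """Return (params, macs) for a standard 2-D convolution layer."""
    kh, kw = kernel
    weights = in_ch * out_ch * kh * kw
    params = weights + (out_ch if bias else 0)
    # Each weight is used once per output position.
    macs = weights * out_h * out_w
    return params, macs


def dense_cost(in_features, out_features, bias=True):
    """Return (params, macs) for a fully connected layer."""
    weights = in_features * out_features
    params = weights + (out_features if bias else 0)
    return params, weights


def within_limits(layer_costs):
    """Sum per-layer (params, macs) pairs and compare against both limits."""
    total_params = sum(p for p, _ in layer_costs)
    total_macs = sum(m for _, m in layer_costs)
    ok = total_params <= MAX_PARAMS and total_macs <= MAX_MACS
    return ok, total_params, total_macs


# Hypothetical toy network, not the baseline architecture.
toy_network = [
    conv2d_cost(1, 16, (3, 3), out_h=40, out_w=50),
    conv2d_cost(16, 32, (3, 3), out_h=20, out_w=25),
    dense_cost(32, 10),
]
ok, n_params, n_macs = within_limits(toy_network)
print(ok, n_params, n_macs)

# The baseline figures reported in the abstract also fit under both limits.
baseline_ok, _, _ = within_limits([(46_500, 29_230_000)])
print(baseline_ok)
```

Because parameters are stored as INT8, one byte each, the parameter limit corresponds to roughly 128 KB of model weights under the same assumption.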
