Paper Title
Training Sound Event Detection On A Heterogeneous Dataset
Paper Authors
Paper Abstract
Training a sound event detection algorithm on a heterogeneous dataset that includes both recorded and synthetic soundscapes, with varying labeling granularity, is a non-trivial task that can lead to systems requiring several technical choices. These choices are often passed from one system to another without being questioned. We propose a detailed analysis of the DCASE 2020 Task 4 sound event detection baseline with regard to several aspects, such as the type of data used for training, the parameters of the mean teacher, and the transformations applied while generating the synthetic soundscapes. Some of the parameters that are usually used as defaults are shown to be sub-optimal.