SEED: Sound Event Early Detection via Evidential Uncertainty

Paper Title

SEED: Sound Event Early Detection via Evidential Uncertainty

Authors

Xujiang Zhao, Xuchao Zhang, Wei Cheng, Wenchao Yu, Yuncong Chen, Haifeng Chen, Feng Chen

Abstract

Sound Event Early Detection (SEED) is an essential task in recognizing acoustic environments and soundscapes. However, most existing methods focus on offline sound event detection, which suffers from over-confidence in early-stage event detection and usually yields unreliable results. To solve this problem, we propose a novel Polyphonic Evidential Neural Network (PENet) that models the evidential uncertainty of the class probability with a Beta distribution. Specifically, we use a Beta distribution to model the distribution of class probabilities, and the evidential uncertainty enriches the uncertainty representation with evidence information, which plays a central role in reliable prediction. To further improve event detection performance, we design a backtrack inference method that utilizes both the forward and backward audio features of an ongoing event. Experiments on the DESED database show that the proposed method simultaneously improves time delay and detection F1 score by 13.0% and 3.8%, respectively, compared to state-of-the-art methods.
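As a rough illustration of how a Beta distribution over a class probability can carry an uncertainty signal, the sketch below follows the standard subjective-logic convention for a binary opinion. It is not the PENet implementation: the function names, the evidence inputs, and both thresholds are hypothetical, and the parameterization (alpha = positive evidence + 1, beta = negative evidence + 1, vacuity = 2 / (alpha + beta)) is an assumption not stated in the abstract.

```python
def beta_opinion(pos_evidence: float, neg_evidence: float) -> tuple[float, float]:
    """Return (expected probability, vacuity) for one event class.

    Assumed convention: a uniform Beta(1, 1) prior, so each parameter is
    the collected evidence plus one.
    """
    if pos_evidence < 0 or neg_evidence < 0:
        raise ValueError("evidence must be non-negative")
    alpha = pos_evidence + 1.0
    beta = neg_evidence + 1.0
    strength = alpha + beta
    prob = alpha / strength      # mean of Beta(alpha, beta)
    vacuity = 2.0 / strength     # high when little evidence has been seen
    return prob, vacuity


def early_decision(pos_evidence: float, neg_evidence: float,
                   prob_threshold: float = 0.5,
                   vacuity_threshold: float = 0.5) -> bool:
    """Report an event only if it is probable AND backed by enough evidence.

    Both thresholds are illustrative values, not taken from the paper.
    """
    prob, vacuity = beta_opinion(pos_evidence, neg_evidence)
    return prob >= prob_threshold and vacuity <= vacuity_threshold
```

With no evidence at all, the expected probability is 0.5 and the vacuity is 1.0, so the detector in this sketch abstains. A class with a high predicted probability but little supporting evidence is also held back, which is the kind of over-confident early decision the abstract describes.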
