Robust ENF Estimation Based on Harmonic Enhancement and Maximum Weight Clique

Paper Title

Robust ENF Estimation Based on Harmonic Enhancement and Maximum Weight Clique

Authors

Hua, Guang; Liao, Han; Zhang, Haijian; Ye, Dengpan; Ma, Jiayi

Abstract

We present a framework for robust electric network frequency (ENF) extraction from real-world audio recordings, featuring multi-tone ENF harmonic enhancement and graph-based optimal harmonic selection. Specifically, we first extend the recently developed single-tone ENF signal enhancement method to the multi-tone scenario and propose a harmonic robust filtering algorithm (HRFA). It enhances each harmonic component separately, without cross-component interference, and thus further reduces the effect of unwanted noise and audio content on the much weaker ENF signal. In addition, because some harmonic components can remain severely corrupted even after enhancement, and so disturb rather than help ENF estimation, we propose a graph-based harmonic selection algorithm (GHSA) that finds the optimal combination of harmonic components for more accurate ENF estimation. Notably, the harmonic selection problem is equivalently formulated as a maximum weight clique (MWC) problem in graph theory, and the Bron-Kerbosch algorithm (BKA) is adopted in the GHSA. With the enhanced and optimally selected harmonic components, both the existing maximum likelihood estimator (MLE) and the weighted MLE (WMLE) are used to produce the final ENF estimates. The proposed framework is extensively evaluated using both synthetic signals and our ENF-WHU dataset, which consists of $130$ real-world audio recordings. The results demonstrate a substantially improved ability to extract the ENF from realistically noisy observations compared with existing single-tone and multi-tone competitors. This work further improves the applicability of the ENF as a forensic criterion in real-world situations.
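As a rough illustration of the MWC formulation mentioned in the abstract, the sketch below enumerates maximal cliques with the Bron-Kerbosch algorithm (with pivoting) and keeps the heaviest one. The abstract does not say how the GHSA builds its graph or assigns weights, so everything specific here is an assumption: the function name `max_weight_clique`, the toy harmonic indices, the edges (read as "these two harmonics agree with each other"), and the per-harmonic weights are all hypothetical. The sketch also assumes strictly positive weights, in which case a maximum weight clique is always a maximal clique.

```python
from typing import Dict, List, Set, Tuple


def max_weight_clique(
    adj: Dict[int, Set[int]], weight: Dict[int, float]
) -> Tuple[List[int], float]:
    """Return the heaviest maximal clique of an undirected graph.

    adj maps each vertex to its set of neighbours (must be symmetric).
    weight maps each vertex to a positive weight (assumption of this sketch).
    """
    best_nodes: List[int] = []
    best_weight = float("-inf")

    def expand(r: Set[int], p: Set[int], x: Set[int]) -> None:
        nonlocal best_nodes, best_weight
        if not p and not x:
            # r cannot be extended further, so it is a maximal clique.
            w = sum(weight[v] for v in r)
            if w > best_weight:
                best_nodes, best_weight = sorted(r), w
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda u: len(adj[u] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}  # v has been fully explored
            x = x | {v}  # exclude v from later cliques in this branch

    expand(set(), set(adj), set())
    return best_nodes, best_weight


# Hypothetical toy graph: vertices are harmonic indices, an edge means the
# two harmonics are considered mutually consistent (illustrative only).
toy_adj = {2: {3, 4}, 3: {2, 4}, 4: {2, 3, 5}, 5: {4}}
toy_weight = {2: 0.9, 3: 0.7, 4: 0.8, 5: 0.95}

selected, total = max_weight_clique(toy_adj, toy_weight)
print(selected, round(total, 2))
```

In this toy graph the maximal cliques are {2, 3, 4} and {4, 5}, so the sketch selects harmonics 2, 3 and 4. Exhaustive enumeration is exponential in the worst case, but it stays cheap when the graph has only a handful of vertices, as is the case for a small set of harmonics.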
