Paper title
Sound Event Classification in an Industrial Environment: Pipe Leakage Detection Use Case
Paper authors
Abstract
This work proposes a multi-stage machine learning (ML) pipeline for pipe leakage detection in an industrial environment. Unlike other industrial and urban environments, the environment under study contains many interfering background noises, which complicate the identification of leaks. Furthermore, the harsh environmental conditions limit the amount of data that can be collected and impose the use of low-complexity algorithms. To address these constraints, the pipeline applies multiple steps, each targeting one of the environment's challenges. It first reduces the data dimensionality through feature selection and then incorporates time correlations by extracting time-based features. The resulting features are fed to a low-complexity Support Vector Machine (SVM) that generalizes well from a small amount of data. An extensive experimental procedure was carried out on two datasets, one with background industrial noise and one without, to evaluate the validity of the proposed pipeline. The SVM hyper-parameters and the parameters specific to the pipeline steps were tuned as part of this procedure. The best models obtained from the dataset with industrial noise and leaks were then applied to datasets without noise, with and without leaks, to test their generalizability. The results show that the model produces excellent results, with 99% accuracy and F1-scores of 0.93 and 0.9 for the respective datasets.
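The three stages named in the abstract (feature selection, time-based features, low-complexity SVM) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes NumPy and scikit-learn, and the synthetic data, the ANOVA F-test selector (`SelectKBest` with `f_classif`), the trailing-window mean and standard deviation, the RBF kernel, and the grid values are all placeholder choices that the abstract does not specify. The helper names `rolling_stats`, `make_synthetic_frames`, and `build_search` are hypothetical.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler
from sklearn.svm import SVC


def rolling_stats(X, window=5):
    """Append the trailing-window mean and std of every feature to each frame.

    Assumption: rows of X are consecutive audio frames in time order.
    """
    X = np.asarray(X, dtype=float)
    # Repeat the first frame so that early frames also see a full window.
    padded = np.vstack([np.repeat(X[:1], window - 1, axis=0), X])
    # Resulting shape: (n_frames, n_features, window).
    windows = np.lib.stride_tricks.sliding_window_view(padded, window, axis=0)
    return np.hstack([X, windows.mean(axis=-1), windows.std(axis=-1)])


def make_synthetic_frames(n_frames=600, n_features=40, seed=0):
    """Placeholder data: Gaussian 'spectral' frames with three leak segments."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_frames, n_features))
    y = np.zeros(n_frames, dtype=int)
    for start in (100, 300, 500):
        y[start:start + 60] = 1  # a leak lasts 60 consecutive frames
    X[y == 1, :5] += 1.5  # the leak shifts the first five features
    return X, y


def build_search():
    """Feature selection -> time-based features -> scaling -> SVM, with tuning."""
    pipeline = Pipeline([
        ("select", SelectKBest(score_func=f_classif, k=10)),
        ("temporal", FunctionTransformer(rolling_stats, kw_args={"window": 5})),
        ("scale", StandardScaler()),
        ("svm", SVC(kernel="rbf")),
    ])
    # Tune SVM hyper-parameters together with the step-specific parameters.
    param_grid = {
        "select__k": [5, 10],
        "temporal__kw_args": [{"window": 3}, {"window": 7}],
        "svm__C": [1.0, 10.0],
    }
    # Note: cross-validation folds cut the frame sequence, so windows that
    # straddle a fold boundary are only approximate in this sketch.
    return GridSearchCV(pipeline, param_grid, scoring="f1", cv=3)


if __name__ == "__main__":
    X, y = make_synthetic_frames()
    # Chronological split: train on the first 400 frames, test on the rest.
    search = build_search().fit(X[:400], y[:400])
    print("best parameters:", search.best_params_)
    print("held-out F1:", search.score(X[400:], y[400:]))
```

Placing the selector before the temporal step mirrors the order given in the abstract and keeps the number of windowed features small, which matters when both data and compute are limited.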