Paper Title

Libra: High-Utility Anonymization of Event Logs for Process Mining via Subsampling

Authors

Gamal Elkoumy, Marlon Dumas

Abstract

Process mining techniques enable analysts to identify and assess process improvement opportunities based on event logs. A common roadblock to process mining is that event logs may contain private information that cannot be used for analysis without consent. An approach to overcome this roadblock is to anonymize the event log so that no individual represented in the original log can be singled out based on the anonymized one. Differential privacy is an anonymization approach that provides this guarantee. A differentially private event log anonymization technique seeks to produce an anonymized log that is as similar as possible to the original one (high utility) while providing a required privacy guarantee. Existing event log anonymization techniques operate by injecting noise into the traces in the log (e.g., duplicating, perturbing, or filtering out some traces). Recent work on differential privacy has shown that a better privacy-utility tradeoff can be achieved by applying subsampling prior to noise injection. In other words, subsampling amplifies privacy. This paper proposes an event log anonymization approach called Libra that exploits this observation. Libra extracts multiple samples of traces from a log, independently injects noise, retains statistically relevant traces from each sample, and composes the samples to produce a differentially private log. An empirical evaluation shows that the proposed approach leads to a considerably higher utility for equivalent privacy guarantees relative to existing baselines.
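
The claim that subsampling amplifies privacy can be illustrated with a commonly cited bound for pure ε-differential privacy: if a mechanism is ε-DP and is run on a random subsample in which each record is included with probability q, the overall guarantee improves to ε' = ln(1 + q(e^ε − 1)). The sketch below only evaluates that bound. It is not Libra's implementation, and the function name `amplified_epsilon` and its parameters are illustrative assumptions rather than anything taken from the paper.

```python
import math


def amplified_epsilon(epsilon: float, sampling_rate: float) -> float:
    """Privacy parameter after running an epsilon-DP mechanism on a subsample.

    Assumption (not from the paper): the standard amplification bound
    eps' = ln(1 + q * (e^eps - 1)), with sampling probability q.
    """
    if epsilon < 0:
        raise ValueError("epsilon must be non-negative")
    if not 0.0 <= sampling_rate <= 1.0:
        raise ValueError("sampling_rate must lie in [0, 1]")
    # log1p and expm1 keep the computation accurate for small epsilon.
    return math.log1p(sampling_rate * math.expm1(epsilon))


if __name__ == "__main__":
    # Smaller sampling rates give a smaller (stronger) effective epsilon.
    for q in (1.0, 0.5, 0.1):
        print(f"q={q}: effective epsilon = {amplified_epsilon(1.0, q):.4f}")
```

With q = 1 the bound returns the original ε, and it shrinks toward 0 as q decreases, which is the effect that leaves room for less noise at the same privacy level.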
