Paper Title
Verifying Pufferfish Privacy in Hidden Markov Models
Paper Authors
Paper Abstract
Pufferfish is a Bayesian privacy framework for designing and analyzing privacy mechanisms. It refines differential privacy, the current gold standard in data privacy, by allowing explicit prior knowledge in the privacy analysis. A number of privacy mechanisms have been developed in the literature under these frameworks. In practice, privacy mechanisms often need to be modified or adapted to specific applications, and their privacy risks have to be re-evaluated under different circumstances. Moreover, computing devices only approximate continuous noise through floating-point computation, which is discrete in nature. Privacy proofs can therefore be complicated and error-prone, and such tedious tasks can be burdensome for the average data curator. In this paper, we propose an automatic verification technique for Pufferfish privacy. We use hidden Markov models to specify and analyze discretized Pufferfish privacy mechanisms, and we show that the Pufferfish verification problem for hidden Markov models is NP-hard. Using Satisfiability Modulo Theories solvers, we propose an algorithm for analyzing privacy requirements. We implement the algorithm in a prototypical tool called FAIER and present several case studies. Surprisingly, our case studies show that naïve discretizations of well-established privacy mechanisms often fail, as witnessed by counterexamples generated by FAIER; for the discretized \emph{Above Threshold} mechanism, we show that the naïve discretization offers absolutely no privacy. Finally, we compare our approach with a testing-based approach on several case studies, and show that our verification technique can be combined with testing for the purpose of (i) efficiently certifying counterexamples and (ii) obtaining a better lower bound for the privacy budget $\epsilon$.
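The abstract refers to naïve discretizations of the Above Threshold (sparse vector) mechanism without spelling out the construction, so the following Python sketch is only an illustration of what such a discretization might look like, not the paper's definition: continuous Laplace noise is replaced by a two-sided geometric ("discrete Laplace") distribution and all comparisons are done over integers. The helper names (geometric0, discrete_laplace, above_threshold_discretized) and the noise scales 2/ε and 4/ε are assumptions carried over from the standard continuous mechanism.

```python
import math
import random

def geometric0(p):
    """Number of failures before the first success; Geometric(p) on {0, 1, 2, ...}."""
    n = 0
    while random.random() >= p:
        n += 1
    return n

def discrete_laplace(scale):
    """Two-sided geometric ('discrete Laplace') noise: P(k) proportional to exp(-|k|/scale).
    Sampled as the difference of two i.i.d. geometric variables (an assumed discretization)."""
    p = 1.0 - math.exp(-1.0 / scale)
    return geometric0(p) - geometric0(p)

def above_threshold_discretized(queries, database, threshold, epsilon):
    """Hypothetical naïve discretization of Above Threshold: report the first
    1-sensitive query whose noisy value exceeds the noisy threshold, then halt."""
    noisy_threshold = threshold + discrete_laplace(2.0 / epsilon)
    answers = []
    for q in queries:
        if q(database) + discrete_laplace(4.0 / epsilon) >= noisy_threshold:
            answers.append(True)   # "above threshold" -- mechanism halts here
            break
        answers.append(False)
    return answers

if __name__ == "__main__":
    db = [0, 1, 1, 0, 1]
    # Prefix-sum counting queries, each with sensitivity 1 (illustrative only).
    queries = [lambda d, i=i: sum(d[:i + 1]) for i in range(len(db))]
    print(above_threshold_discretized(queries, db, threshold=2, epsilon=1.0))
```

Given the paper's finding that naïve discretizations of Above Threshold can offer no privacy at all, a sketch like this should be read as the kind of object FAIER takes as input for verification, not as a privacy-preserving implementation.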