Reconstruction of the distribution of sensitive data under free-will privacy

Paper title

Reconstruction of the distribution of sensitive data under free-will privacy

Authors

Ehab ElSalamouny, Catuscia Palamidessi

Abstract

Local privacy mechanisms, such as k-RR, RAPPOR, and the geo-indistinguishability ones, have become quite popular because the obfuscation can be performed at the user's end, which avoids the need for a trusted third party. Another important advantage is that each data point is sanitized independently of the others, so different users may use different levels of obfuscation depending on their privacy requirements, or even entirely different mechanisms depending on the services they are trading their data for. A challenging requirement in this setting is to reconstruct the original distribution of the users' sensitive data from their noisy versions. Existing techniques can only estimate that distribution separately for each obfuscation scheme and its corresponding subset of noisy data, and the smaller the subsets, the less precise the estimates. In this paper we study how to avoid this subset-fractioning problem when combining local privacy mechanisms, thus recovering optimal utility. We focus on the estimation of the original distribution and on the two main methods for estimating it: the matrix-inversion method and the iterative Bayes update. We consider various combinations of local privacy mechanisms and compare the flexibility and performance of the two methods.
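As a rough illustration of the setting the abstract describes, the following Python sketch simulates users who each apply k-RR with one of two privacy levels, then runs an iterative Bayes update over all reports at once rather than over each subset separately. It is a minimal sketch under stated assumptions, not the authors' implementation: the per-report form of the update, the function names (`krr_channel`, `ibu`), the two epsilon values, and the synthetic distribution are all chosen here for illustration.

```python
import math
import random
from collections import Counter


def krr_channel(k, epsilon):
    """k x k k-RR channel: C[x][y] = P(report y | true value x)."""
    e = math.exp(epsilon)
    keep = e / (e + k - 1)    # probability of reporting the true value
    flip = 1.0 / (e + k - 1)  # probability of reporting each other value
    return [[keep if x == y else flip for y in range(k)] for x in range(k)]


def ibu(counts, channels, k, iterations=300):
    """Iterative Bayes update over reports from several mechanisms.

    counts maps (mechanism index, reported value) to the number of such
    reports; channels[m] is the channel matrix of mechanism m.
    Assumption: each report is inverted through its own channel, so all
    users contribute to a single joint estimate.
    """
    n = sum(counts.values())
    p = [1.0 / k] * k  # start from the uniform distribution
    for _ in range(iterations):
        new_p = [0.0] * k
        for (m, y), c in counts.items():
            # Posterior over the true value given report y under the current estimate.
            joint = [p[x] * channels[m][x][y] for x in range(k)]
            total = sum(joint)
            for x in range(k):
                new_p[x] += c * joint[x] / total
        p = [v / n for v in new_p]
    return p


rng = random.Random(0)
k = 4
true_dist = [0.5, 0.3, 0.15, 0.05]                   # hypothetical ground truth
channels = [krr_channel(k, 0.5), krr_channel(k, 2.0)]  # stricter and looser privacy

counts = Counter()
for i in range(20000):
    x = rng.choices(range(k), weights=true_dist)[0]  # user's true value
    m = i % 2                                        # user's chosen mechanism
    y = rng.choices(range(k), weights=channels[m][x])[0]
    counts[(m, y)] += 1

estimate = ibu(counts, channels, k)
print([round(v, 3) for v in estimate])
```

Because every report enters the same update, the estimate draws on all 20000 noisy values instead of two separate halves, which is the kind of subset fractioning the abstract aims to avoid.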
