Paper Title

Compliance Checking with NLI: Privacy Policies vs. Regulations

Authors

Rabinia, Amin; Nygaard, Zane

Abstract

A privacy policy is a document that states how a company intends to handle and manage its customers' personal data. One problem with these policies is that their content might violate data privacy regulations. Because so many privacy policies exist, the only realistic way to check all of them for legal inconsistencies is an automated method. In this work, we use Natural Language Inference (NLI) techniques to compare privacy regulations against sections of privacy policies from a selection of large companies. Our NLI model uses pre-trained embeddings, along with a BiLSTM in its attention mechanism. We tried two versions of our model: one trained on the Stanford Natural Language Inference (SNLI) dataset and one trained on the Multi-Genre Natural Language Inference (MNLI) dataset. We found that test accuracy was higher for the model trained on SNLI, but on NLI tasks over real-world privacy policies, the model trained on MNLI generalized and performed much better.
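To make the comparison step concrete, here is a minimal sketch of how an NLI classifier could be applied to pairs of regulation text and policy text. It is not the authors' implementation. Several things are assumptions for illustration only: the regulation is treated as the premise and the policy section as the hypothesis, a "contradiction" label is read as a potential violation, and the names `check_compliance` and `toy_classifier` are hypothetical. The keyword rule in `toy_classifier` merely stands in for a trained model such as the BiLSTM described in the abstract.

```python
from typing import Callable, Iterable, List, Tuple

# The three standard labels used by the SNLI and MNLI datasets.
LABELS = ("entailment", "neutral", "contradiction")


def check_compliance(
    regulations: Iterable[str],
    policy_sections: Iterable[str],
    classify: Callable[[str, str], str],
) -> List[Tuple[str, str]]:
    """Return (regulation, policy section) pairs labeled as contradictions.

    Assumption: the regulation is the premise, the policy section is the
    hypothesis, and a contradiction signals a potential violation.
    """
    sections = list(policy_sections)  # reused for every regulation
    findings = []
    for regulation in regulations:
        for section in sections:
            label = classify(regulation, section)
            if label not in LABELS:
                raise ValueError(f"unexpected NLI label: {label!r}")
            if label == "contradiction":
                findings.append((regulation, section))
    return findings


def toy_classifier(premise: str, hypothesis: str) -> str:
    """Hypothetical stand-in for a trained NLI model (keyword rule only)."""
    p, h = premise.lower(), hypothesis.lower()
    if "without consent" in p and "without consent" in h:
        return "contradiction"
    if "consent" in p and "consent" in h:
        return "entailment"
    return "neutral"


regulations = ["Personal data must not be shared without consent."]
policy_sections = [
    "We may share your data with partners without consent.",
    "We ask for your consent before sharing data.",
    "We use cookies to remember your language.",
]

findings = check_compliance(regulations, policy_sections, toy_classifier)
for regulation, section in findings:
    print("Potential violation:", section)
```

In practice the `classify` argument would wrap a model trained on SNLI or MNLI; keeping it as a plain callable lets the two trained variants be swapped and compared on the same policy sections.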
