Paper Title

The Ghost in the Machine has an American accent: value conflict in GPT-3

Authors

Johnson, Rebecca L; Pistilli, Giada; Menédez-González, Natalia; Duran, Leslye Denisse Dias; Panai, Enrico; Kalpokiene, Julija; Bertulfo, Donald Jay

Abstract

The alignment problem in the context of large language models must consider the plurality of human values in our world. Whilst there are many resonant and overlapping values amongst the world's cultures, there are also many conflicting, yet equally valid, values. It is important to observe which cultural values a model exhibits, particularly when there is a value conflict between input prompts and generated outputs. We discuss how the co-creation of language and cultural value impacts large language models (LLMs). We explore the constitution of the training data for GPT-3 and compare that to the world's language and internet access demographics, as well as to reported statistical profiles of dominant values in some nation-states. We stress-tested GPT-3 with a range of value-rich texts representing several languages and nations, including some with values orthogonal to dominant US public opinion as reported by the World Values Survey. We observed when values embedded in the input text were mutated in the generated outputs and noted when these conflicting values were more aligned with reported dominant US values. Our discussion of these results uses a moral value pluralism (MVP) lens to better understand these value mutations. Finally, we provide recommendations for how our work may contribute to other current work in the field.
