Paper Title

Moral Mimicry: Large Language Models Produce Moral Rationalizations Tailored to Political Identity

Paper Author

Simmons, Gabriel

Paper Abstract

Large Language Models (LLMs) have demonstrated impressive capabilities in generating fluent text, as well as tendencies to reproduce undesirable social biases. This study investigates whether LLMs reproduce the moral biases associated with political groups in the United States, an instance of a broader capability herein termed moral mimicry. This hypothesis is explored in the GPT-3/3.5 and OPT families of Transformer-based LLMs. Using tools from Moral Foundations Theory, it is shown that these LLMs are indeed moral mimics. When prompted with a liberal or conservative political identity, the models generate text reflecting corresponding moral biases. This study also explores the relationship between moral mimicry and model size, and similarity between human and LLM moral word use.
