Paper Title

Quantifying Memorization Across Neural Language Models

Authors

Nicholas Carlini, Daphne Ippolito, Matthew Jagielski, Katherine Lee, Florian Tramer, Chiyuan Zhang

Abstract

Large language models (LMs) have been shown to memorize parts of their training data, and when prompted appropriately, they will emit the memorized training data verbatim. This is undesirable because memorization violates privacy (exposing user data), degrades utility (repeated easy-to-memorize text is often low quality), and hurts fairness (some texts are memorized over others). We describe three log-linear relationships that quantify the degree to which LMs emit memorized training data. Memorization significantly grows as we increase (1) the capacity of a model, (2) the number of times an example has been duplicated, and (3) the number of tokens of context used to prompt the model. Surprisingly, we find the situation becomes more complicated when generalizing these results across model families. On the whole, we find that memorization in LMs is more prevalent than previously believed and will likely get worse as models continue to scale, at least without active mitigations.
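The measurements summarized above rest on a simple extraction test: prompt the model with a prefix of a training example and check whether greedy decoding reproduces the true continuation verbatim. The snippet below is a minimal sketch of that check, not the authors' code; it assumes a Hugging Face causal LM, and the model name (`gpt2`), `PREFIX_LEN`, and `CONTINUATION_LEN` are illustrative placeholders rather than the paper's actual configuration.

```python
# Minimal sketch of a verbatim-extraction check (illustrative, not the paper's code).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"      # assumption: any causal LM checkpoint could be used here
PREFIX_LEN = 50          # number of context tokens used as the prompt
CONTINUATION_LEN = 50    # tokens that must match exactly to count as memorized

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def is_memorized(training_text: str) -> bool:
    """Return True if greedy decoding emits the example's continuation verbatim."""
    ids = tokenizer(training_text, return_tensors="pt").input_ids[0]
    if ids.shape[0] < PREFIX_LEN + CONTINUATION_LEN:
        return False  # example too short for this prefix/continuation split
    prefix = ids[:PREFIX_LEN].unsqueeze(0)
    target = ids[PREFIX_LEN:PREFIX_LEN + CONTINUATION_LEN]
    with torch.no_grad():
        out = model.generate(
            prefix,
            max_new_tokens=CONTINUATION_LEN,
            do_sample=False,  # greedy decoding
            pad_token_id=tokenizer.eos_token_id,
        )
    # Generated output starts with the prompt tokens; compare only the continuation.
    generated = out[0, PREFIX_LEN:PREFIX_LEN + CONTINUATION_LEN]
    return torch.equal(generated, target)
```

Sweeping the model size, the number of times each example is duplicated in the training set, and `PREFIX_LEN`, then plotting the fraction of examples for which `is_memorized` returns True, is the kind of measurement behind the three log-linear relationships described in the abstract.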
