Paper Title

Trustworthy Social Bias Measurement

Paper Authors

Rishi Bommasani, Percy Liang

Paper Abstract

How do we design measures of social bias that we trust? While prior work has introduced several measures, no measure has gained widespread trust: instead, mounting evidence argues we should distrust these measures. In this work, we design bias measures that warrant trust based on the cross-disciplinary theory of measurement modeling. To combat the frequently fuzzy treatment of social bias in NLP, we explicitly define social bias, grounded in principles drawn from social science research. We operationalize our definition by proposing a general bias measurement framework DivDist, which we use to instantiate 5 concrete bias measures. To validate our measures, we propose a rigorous testing protocol with 8 testing criteria (e.g. predictive validity: do measures predict biases in US employment?). Through our testing, we demonstrate considerable evidence to trust our measures, showing they overcome conceptual, technical, and empirical deficiencies present in prior measures.
