The Geometry of Distributed Representations for Better Alignment, Attenuated Bias, and Improved Interpretability

Title

The Geometry of Distributed Representations for Better Alignment, Attenuated Bias, and Improved Interpretability

Author

Dev, Sunipa

Abstract

High-dimensional representations of words, text, images, knowledge graphs, and other structured data are commonly used across different paradigms of machine learning and data mining. These representations vary in interpretability: efficient distributed representations come at the cost of losing the feature-to-dimension mapping, so the way concepts are captured in these embedding spaces is obscured. The effects appear in many representations and tasks. A particularly problematic case is language representations, where societal biases learned from the underlying data are captured and occluded in unknown dimensions and subspaces. As a result, invalid associations (such as between different races and a polar notion of good versus bad) are made and propagated by the representations, leading to unfair outcomes in the tasks where they are used. This work addresses some of these problems pertaining to the transparency and interpretability of such representations. A primary focus is the detection, quantification, and mitigation of socially biased associations in language representations.
