Incorporation of Human Knowledge into Data Embeddings to Improve Pattern Significance and Interpretability

Paper Title

Incorporation of Human Knowledge into Data Embeddings to Improve Pattern Significance and Interpretability

Authors

Li, Jie; Zhou, Chun-qi

Abstract

Embedding is a common technique for analyzing multi-dimensional data. However, the embedding projection cannot always form significant and interpretable visual structures that foreshadow underlying data patterns. We propose an approach that incorporates human knowledge into data embeddings to improve pattern significance and interpretability. The core idea is (1) externalizing tacit human knowledge as explicit sample labels and (2) adding a classification loss in the embedding network to encode samples' classes. The approach pulls samples of the same class with similar data features closer in the projection, leading to more compact (significant) and class-consistent (interpretable) visual structures. We give an embedding network with a customized classification loss to implement the idea and integrate the network into a visualization system to form a workflow that supports flexible class creation and pattern exploration. Patterns found on open datasets in case studies, subjects' performance in a user study, and quantitative experiment results illustrate the general usability and effectiveness of the approach.
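The core idea in the abstract, an embedding objective extended with a classification term, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not state the embedding loss or the form of the customized classification loss, so the sketch substitutes a simple pairwise-distance stress term and a plain softmax cross-entropy, and the names `pairwise_stress`, `cross_entropy`, `combined_loss`, and `weight` are hypothetical.

```python
import math


def pairwise_stress(high, low):
    """Mean squared gap between pairwise distances in data space and in the projection.

    Assumed stand-in for the embedding loss; the abstract does not specify one.
    """
    total, pairs = 0.0, 0
    for i in range(len(high)):
        for j in range(i + 1, len(high)):
            d_high = math.dist(high[i], high[j])
            d_low = math.dist(low[i], low[j])
            total += (d_high - d_low) ** 2
            pairs += 1
    return total / pairs if pairs else 0.0


def cross_entropy(logits, labels):
    """Mean softmax cross-entropy over samples.

    Assumed stand-in for the paper's customized classification loss.
    """
    total = 0.0
    for row, label in zip(logits, labels):
        peak = max(row)  # subtract the maximum for numerical stability
        log_norm = peak + math.log(sum(math.exp(v - peak) for v in row))
        total += log_norm - row[label]
    return total / len(labels)


def combined_loss(high, low, logits, labels, weight=1.0):
    """Embedding term plus a weighted classification term (hypothetical weighting)."""
    return pairwise_stress(high, low) + weight * cross_entropy(logits, labels)


# Two samples with user-assigned labels 0 and 1, projected to 2-D.
high = [[0.0, 0.0, 0.0], [3.0, 4.0, 0.0]]
low = [[0.0, 0.0], [0.0, 1.0]]
logits = [[2.0, 0.0], [0.0, 2.0]]
labels = [0, 1]
print(combined_loss(high, low, logits, labels, weight=0.5))
```

In a trainable network the two terms would be minimized jointly, so the `weight` parameter controls how strongly the user-assigned labels shape the projection relative to the data features.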
