Knowledge-Guided Multi-Label Few-Shot Learning for General Image Recognition

Paper Title

Knowledge-Guided Multi-Label Few-Shot Learning for General Image Recognition

Authors

Chen, Tianshui; Lin, Liang; Chen, Riquan; Hui, Xiaolu; Wu, Hefeng

Abstract

Recognizing multiple labels of an image is a practical yet challenging task, and remarkable progress has been achieved by searching for semantic regions and exploiting label dependencies. However, current works utilize RNN/LSTM to implicitly capture sequential region/label dependencies, which cannot fully explore mutual interactions among the semantic regions/labels and do not explicitly integrate label co-occurrences. In addition, these works require large amounts of training samples for each category, and they are unable to generalize to novel categories with limited samples. To address these issues, we propose a knowledge-guided graph routing (KGGR) framework, which unifies prior knowledge of statistical label correlations with deep neural networks. The framework exploits prior knowledge to guide adaptive information propagation among different categories to facilitate multi-label analysis and reduce the dependency of training samples. Specifically, it first builds a structured knowledge graph to correlate different labels based on statistical label co-occurrence. Then, it introduces the label semantics to guide learning semantic-specific features to initialize the graph, and it exploits a graph propagation network to explore graph node interactions, enabling learning contextualized image feature representations. Moreover, we initialize each graph node with the classifier weights for the corresponding label and apply another propagation network to transfer node messages through the graph. In this way, it can facilitate exploiting the information of correlated labels to help train better classifiers. We conduct extensive experiments on the traditional multi-label image recognition (MLR) and multi-label few-shot learning (ML-FSL) tasks and show that our KGGR framework outperforms the current state-of-the-art methods by sizable margins on the public benchmarks.
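
The abstract's first two steps (building a label graph from co-occurrence statistics, then passing messages between label nodes) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cooccurrence_graph` and `propagate` are hypothetical, the edge weight is assumed to be the conditional frequency of label j given label i, and the additive update is a deliberately simple stand-in for the learned propagation network the abstract mentions.

```python
def cooccurrence_graph(label_sets, num_labels):
    """Estimate edge weights as P(label j present | label i present).

    Assumption: `label_sets` is a list of per-image label index lists taken
    from training annotations. The paper's exact statistic may differ.
    """
    counts = [0] * num_labels
    pair_counts = [[0] * num_labels for _ in range(num_labels)]
    for labels in label_sets:
        present = sorted(set(labels))  # ignore repeated labels within one image
        for i in present:
            counts[i] += 1
            for j in present:
                if i != j:
                    pair_counts[i][j] += 1
    return [
        [pair_counts[i][j] / counts[i] if counts[i] else 0.0
         for j in range(num_labels)]
        for i in range(num_labels)
    ]


def propagate(features, graph):
    """One illustrative message-passing step over the label graph.

    Each node adds the edge-weighted sum of the other nodes' features to its
    own. This plain additive rule is an assumption made for illustration; it
    is not the learned propagation network described in the abstract.
    """
    n = len(features)
    dim = len(features[0])
    updated = []
    for i in range(n):
        message = [
            sum(graph[i][j] * features[j][d] for j in range(n))
            for d in range(dim)
        ]
        updated.append([features[i][d] + message[d] for d in range(dim)])
    return updated


# Toy annotations: four images over three labels (0, 1, 2).
toy_labels = [[0, 1], [0, 1], [0, 2], [1]]
toy_graph = cooccurrence_graph(toy_labels, num_labels=3)
toy_features = propagate([[1.0], [2.0], [4.0]], toy_graph)
```

In the toy data, label 2 only ever appears together with label 0, so the edge from node 2 to node 0 has weight 1.0 and node 2 absorbs node 0's feature in full. In the same way, a rarely seen label can borrow information from the labels it correlates with, which is the intuition the abstract gives for the few-shot setting.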
