Paper title
A Survey on Visual Transfer Learning using Knowledge Graphs
Paper authors
Paper abstract
Recent computer vision approaches rely on deep learning (DL) methods, which perform well when the training and testing domains follow the same underlying data distribution. However, it has been shown that minor variations in the images, which occur when these methods are used in the real world, can lead to unpredictable errors. Transfer learning is the area of machine learning that tries to prevent these errors. In particular, approaches that augment image data with auxiliary knowledge encoded in language embeddings or knowledge graphs (KGs) have achieved promising results in recent years. This survey focuses on visual transfer learning approaches that use KGs. KGs can represent auxiliary knowledge either in an underlying graph-structured schema or in a vector-based knowledge graph embedding. To enable the reader to solve visual transfer learning problems with the help of specific KG-DL configurations, we start with a description of the relevant modeling structures of a KG of varying expressiveness, such as directed labeled graphs, hypergraphs, and hyper-relational graphs. We explain the notion of a feature extractor, with specific reference to visual and semantic features. We provide a broad overview of knowledge graph embedding methods and describe several joint training objectives suitable for combining them with high-dimensional visual embeddings. The main section introduces four categories of how a KG can be combined with a DL pipeline: 1) Knowledge Graph as a Reviewer; 2) Knowledge Graph as a Trainee; 3) Knowledge Graph as a Trainer; and 4) Knowledge Graph as a Peer. To help researchers find evaluation benchmarks, we provide an overview of generic KGs and a set of image processing datasets and benchmarks that include various types of auxiliary knowledge. Finally, we summarize related surveys and give an outlook on challenges and open issues for future research.
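As a rough illustration of the two representations named in the abstract, the sketch below stores a toy KG as a directed labeled graph of (head, relation, tail) triples and scores one triple in a vector space. The entities, relations, and 2-d vectors are hypothetical and hand-picked, and the TransE-style distance score is our choice of example embedding method; the abstract does not single out any particular method.

```python
import math

# Hypothetical toy KG as a directed labeled graph: (head, relation, tail) triples.
triples = [
    ("zebra", "is_a", "mammal"),
    ("zebra", "has_part", "stripes"),
    ("horse", "is_a", "mammal"),
]


def neighbors(triples, head):
    """Return the outgoing (relation, tail) edges of a node, in insertion order."""
    return [(r, t) for h, r, t in triples if h == head]


# Hand-picked 2-d vectors for illustration only; a real system would learn them.
entity_vec = {
    "zebra": (1.0, 0.0),
    "mammal": (1.0, 1.0),
    "stripes": (3.0, 0.0),
}
relation_vec = {"is_a": (0.0, 1.0)}


def transe_score(h, r, t):
    """TransE-style plausibility: negative Euclidean distance between h + r and t."""
    hv, rv, tv = entity_vec[h], relation_vec[r], entity_vec[t]
    return -math.sqrt(sum((a + b - c) ** 2 for a, b, c in zip(hv, rv, tv)))


if __name__ == "__main__":
    print(neighbors(triples, "zebra"))
    # A triple that holds in the toy KG should score higher than a corrupted one.
    print(transe_score("zebra", "is_a", "mammal"))
    print(transe_score("zebra", "is_a", "stripes"))
```

In this toy setup the triple that is present in the graph scores higher than the corrupted one, which is the property a learned embedding would be trained to approximate.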