Paper Title
Exploring the Hierarchy in Relation Labels for Scene Graph Generation
Authors
Abstract
Current approaches formulate relationship detection as a classification problem by assigning each relationship a single label. Under this formulation, predicate categories are treated as completely distinct classes. However, unlike object labels, whose classes have explicit boundaries, predicates usually overlap in their semantic meanings. For example, sit\_on and stand\_on both describe a vertical relationship but differ in the details of how the two objects are vertically placed. To leverage the inherent structure of the predicate categories, we propose to first build a language hierarchy and then use a Hierarchy Guided Feature Learning (HGFL) strategy to learn better region features at both the coarse-grained level and the fine-grained level. We also propose the Hierarchy Guided Module (HGM), which uses the coarse-grained level to guide the learning of fine-grained level features. Experiments show that the proposed simple yet effective method improves several state-of-the-art baselines by a large margin (up to $33\%$ relative gain) in terms of Recall@50 on the task of Scene Graph Generation on different datasets.
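To make the idea of a two-level predicate hierarchy concrete, here is a minimal sketch in plain Python. It is not the paper's HGFL or HGM implementation. The grouping in `HIERARCHY`, the helper `hierarchical_probs`, and the factorisation p(fine) = p(coarse) * p(fine | coarse) are all assumptions chosen for illustration, as are the example scores.

```python
import math

# Hypothetical two-level hierarchy: coarse group -> fine-grained predicates.
# The grouping is illustrative only and is not taken from the paper.
HIERARCHY = {
    "vertical": ["sit_on", "stand_on"],
    "possession": ["has", "wearing"],
}

# Reverse lookup: fine predicate -> its coarse parent.
PARENT = {fine: coarse for coarse, fines in HIERARCHY.items() for fine in fines}


def softmax(scores):
    """Numerically stable softmax over a dict of label -> raw score."""
    peak = max(scores.values())
    exps = {k: math.exp(v - peak) for k, v in scores.items()}
    total = sum(exps.values())
    return {k: v / total for k, v in exps.items()}


def hierarchical_probs(coarse_scores, fine_scores):
    """Assumed combination rule: p(fine) = p(coarse) * p(fine | coarse)."""
    p_coarse = softmax(coarse_scores)
    probs = {}
    for coarse, fines in HIERARCHY.items():
        # Normalise fine scores only among siblings of the same coarse group.
        p_within = softmax({f: fine_scores[f] for f in fines})
        for f in fines:
            probs[f] = p_coarse[coarse] * p_within[f]
    return probs


# Made-up raw scores standing in for a model's coarse and fine outputs.
coarse_scores = {"vertical": 2.0, "possession": 0.0}
fine_scores = {"sit_on": 1.0, "stand_on": 0.0, "has": 0.5, "wearing": 0.5}

probs = hierarchical_probs(coarse_scores, fine_scores)
best = max(probs, key=probs.get)
print(best, PARENT[best])
```

In this sketch the coarse-level score decides how much probability mass a group of sibling predicates receives, so sit\_on and stand\_on compete only with each other inside the "vertical" group. That is one simple way a coarse level can guide a fine-grained decision; the paper's actual module operates on learned features and may differ substantially.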