Multimodal graph inference network for scene graph generation

Author(s):
Jingwen Duan, Weidong Min, Deyu Lin, Jianfeng Xu, Xin Xiong
2020, Vol 14 (7), pp. 546-553
Author(s):
Zhenxing Zheng, Zhendong Li, Gaoyun An, Songhe Feng
2021, Vol 10 (7), pp. 488
Author(s):
Peng Li, Dezheng Zhang, Aziguli Wulamu, Xin Liu, Peng Chen

A deep understanding of our visual world involves more than perceiving isolated objects; the relationships between objects also carry rich semantic information. This is especially true for satellite remote sensing images, whose large spatial extent means that objects vary widely in size and form complex spatial compositions. Recognizing semantic relations therefore strengthens the understanding of remote sensing scenes. In this paper, we propose a novel multi-scale semantic fusion network (MSFN). In this framework, dilated convolution is introduced into a graph convolutional network (GCN) with an attention mechanism to fuse and refine multi-scale semantic context, which is crucial to strengthening the cognitive ability of our model. Besides, based on the mapping between visual features and semantic embeddings, we design a sparse relationship extraction module that removes meaningless connections among entities and improves the efficiency of scene graph generation. To further promote research on scene understanding in the remote sensing field, this paper also proposes a remote sensing scene graph dataset (RSSGD). Extensive experiments show that our model significantly outperforms previous methods on scene graph generation. In addition, RSSGD effectively bridges the huge semantic gap between low-level perception and high-level cognition of remote sensing images.
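The core mechanism the abstract describes (parallel dilated convolutions whose multi-scale outputs are fused by attention before being propagated over a graph) can be sketched compactly. The following is a minimal PyTorch sketch of that idea only; the class names, dilation rates, and layer sizes are illustrative assumptions, not the authors' MSFN implementation.

```python
# Minimal sketch: dilated convolutions at several rates capture context at
# different receptive fields, an attention vector weights the scales, and a
# simple GCN layer propagates the fused features over the scene graph.
# All names and sizes below are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleSemanticFusion(nn.Module):
    def __init__(self, channels=256, dilations=(1, 2, 4)):
        super().__init__()
        # One 3x3 conv per dilation rate; padding=d keeps the spatial size fixed.
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d)
            for d in dilations
        )
        # Attention weights over scales, computed from globally pooled features.
        self.scale_attn = nn.Linear(channels, len(dilations))

    def forward(self, x):                                   # x: (B, C, H, W)
        feats = torch.stack([F.relu(b(x)) for b in self.branches], dim=1)  # (B, S, C, H, W)
        weights = F.softmax(self.scale_attn(x.mean(dim=(2, 3))), dim=-1)   # (B, S)
        return (weights[:, :, None, None, None] * feats).sum(dim=1)        # (B, C, H, W)

class GCNLayer(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, node_feats, adj):                     # (N, D), (N, N)
        # Row-normalize the adjacency so each node averages its neighbors.
        adj = adj / adj.sum(dim=-1, keepdim=True).clamp(min=1e-6)
        return F.relu(self.proj(adj @ node_feats))
```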


Author(s):
Yuyu Guo, Jingkuan Song, Lianli Gao, Heng Tao Shen
2021
Author(s):
Zhichao Zhang, Junyu Dong, Qilu Zhao, Lin Qi, Shu Zhang
Author(s):
Yikang Li, Wanli Ouyang, Bolei Zhou, Jianping Shi, Chao Zhang, ...
Author(s):
Jing Yu, Yuan Chai, Yujing Wang, Yue Hu, Qi Wu

Scene graphs are semantic abstractions of images that support visual understanding and reasoning. However, the performance of Scene Graph Generation (SGG) is unsatisfactory when faced with biased data in real-world scenarios. Conventional debiasing research mainly approaches the problem by balancing the data distribution or learning unbiased models and representations, ignoring the correlations among the biased classes. In this work, we analyze the problem from a novel cognition perspective: we automatically build a hierarchical cognitive structure from the biased predictions and navigate that hierarchy to locate relationships, so that tail relationships receive more attention in a coarse-to-fine manner. To this end, we propose a novel debiasing Cognition Tree (CogTree) loss for unbiased SGG. We first build the CogTree to organize the relationships based on the predictions of a biased SGG model: it distinguishes markedly different relationships first and then focuses on a small portion of easily confused ones. We then propose a debiasing loss tailored to this cognitive structure, which supports coarse-to-fine distinction of the correct relationships. The loss is model-agnostic and consistently boosts the performance of several state-of-the-art models. The code is available at: https://github.com/CYVincent/Scene-Graph-Transformer-CogTree.
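The coarse-to-fine supervision described above can be illustrated with a small sketch: fine predicate classes are grouped into coarse clusters, the model is first trained to select the correct cluster, and then to separate the easily confused classes inside it. The grouping, masking, and weighting below are assumptions for illustration (in the paper the hierarchy is built from a biased model's predictions); the authors' actual CogTree loss is in the linked repository.

```python
# Sketch of a coarse-to-fine, tree-structured loss in the spirit of CogTree.
# Coarse logits come from log-sum-exp over each group's fine logits; the fine
# loss is restricted to the true group so confusable siblings compete only
# with each other. Names and the weighting alpha are illustrative assumptions.
import torch
import torch.nn.functional as F

def coarse_to_fine_loss(logits, targets, class_to_group, num_groups, alpha=1.0):
    """logits: (B, C) fine-class scores; targets: (B,) fine labels;
    class_to_group: (C,) LongTensor giving each fine class's coarse group.
    Assumes every group index in [0, num_groups) has at least one member."""
    # Coarse level: aggregate fine logits within each group.
    coarse_logits = torch.full((logits.size(0), num_groups), float('-inf'),
                               device=logits.device)
    for g in range(num_groups):
        members = (class_to_group == g).nonzero(as_tuple=True)[0]
        coarse_logits[:, g] = torch.logsumexp(logits[:, members], dim=-1)
    coarse_targets = class_to_group[targets]
    coarse_loss = F.cross_entropy(coarse_logits, coarse_targets)

    # Fine level: mask out classes outside the true coarse group.
    mask = class_to_group[None, :] == coarse_targets[:, None]   # (B, C)
    fine_logits = logits.masked_fill(~mask, float('-inf'))
    fine_loss = F.cross_entropy(fine_logits, targets)
    return coarse_loss + alpha * fine_loss
```

Because the loss only consumes logits and labels, it is model-agnostic in the same sense as the abstract's claim: it can be attached to any SGG model's predicate classifier without changing the backbone.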

