GraphSMOTE: Imbalanced Node Classification on Graphs with Graph Neural Networks

Author(s): Tianxiang Zhao, Xiang Zhang, Suhang Wang
2021, Vol 6 (1)
Author(s): Hussain Hussain, Tomislav Duricic, Elisabeth Lex, Denis Helic, Roman Kern

Abstract: Graph Neural Networks (GNNs) are effective in many applications. Still, there is limited understanding of how common graph structures affect the learning process of GNNs. To fill this gap, we study the impact of community structure and homophily on the performance of GNNs in semi-supervised node classification. Our methodology consists of systematically manipulating the structure of eight datasets and measuring the performance of GNNs on the original graphs, as well as the change in performance in the presence and absence of community structure and/or homophily. Our results show that both homophily and communities have a major impact on the classification accuracy of GNNs, and provide insights into their interplay. In particular, by analyzing community structure and its correlation with node labels, we are able to make informed predictions about the suitability of GNNs for classification on a given graph. Using an information-theoretic metric for community-label correlation, we devise a guideline for model selection based on graph structure. With our work, we provide insights into the abilities of GNNs and the impact of common network phenomena on their performance, thereby improving model selection for node classification in semi-supervised settings.
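As an illustration of the community-label correlation idea, the sketch below estimates how informative detected communities are about node labels using normalized mutual information. This is a stand-in metric, not necessarily the one used in the paper; networkx, python-louvain, and scikit-learn are assumed dependencies, and the interpretation of the score is hypothetical.

```python
# A minimal sketch (not the paper's exact metric): estimate how well
# community structure predicts node labels via normalized mutual
# information (NMI). Assumes networkx, python-louvain, scikit-learn.
import networkx as nx
import community as community_louvain  # pip install python-louvain
from sklearn.metrics import normalized_mutual_info_score

def community_label_correlation(G, labels):
    """Return NMI between detected communities and node labels.

    G      : networkx.Graph
    labels : dict mapping node -> class label
    """
    partition = community_louvain.best_partition(G)  # node -> community id
    nodes = list(G.nodes())
    comm = [partition[n] for n in nodes]
    lab = [labels[n] for n in nodes]
    return normalized_mutual_info_score(lab, comm)

# Hypothetical usage: under this reading, a high score suggests that
# structure-aware models such as GNNs are likely to help on this graph.
G = nx.karate_club_graph()
labels = {n: G.nodes[n]["club"] for n in G.nodes()}
print(community_label_correlation(G, labels))
```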


2020, Vol 34 (04), pp. 6656-6663
Author(s): Huaxiu Yao, Chuxu Zhang, Ying Wei, Meng Jiang, Suhang Wang, ...

The challenging problem of semi-supervised node classification has been studied extensively. As a frontier, Graph Neural Networks (GNNs) have recently aroused great interest; they update the representation of each node by aggregating information from its neighbors. However, most GNNs are shallow, have a limited receptive field, and may not achieve satisfactory performance, especially when the number of labeled nodes is quite small. To address this challenge, we propose a graph few-shot learning (GFL) algorithm that incorporates prior knowledge learned from auxiliary graphs to improve classification accuracy on the target graph. Specifically, a transferable metric space, characterized by a node embedding function and a graph-specific prototype embedding function, is shared between the auxiliary graphs and the target graph, facilitating the transfer of structural knowledge. Extensive experiments and ablation studies on four real-world graph datasets demonstrate the effectiveness of our proposed model and the contribution of each component.
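The transferable metric space described above rests on prototype-based classification, which can be sketched generically. The snippet below implements a prototypical-network-style nearest-prototype classifier over node embeddings; it is not GFL's graph-specific prototype embedding function, and the embeddings, shot counts, and dimensions are placeholder assumptions.

```python
# A minimal prototypical-classification sketch over node embeddings
# (generic, not GFL's graph-specific prototype function). Assumes the
# embeddings come from some pretrained GNN encoder.
import torch

def prototypes(support_emb, support_y, num_classes):
    """Mean support embedding per class -> [num_classes, dim]."""
    return torch.stack([
        support_emb[support_y == c].mean(dim=0) for c in range(num_classes)
    ])

def classify(query_emb, protos):
    """Assign each query node to its nearest class prototype."""
    dists = torch.cdist(query_emb, protos)  # [num_query, num_classes]
    return dists.argmin(dim=1)

# Hypothetical 3-way, 5-shot episode with 16-dimensional embeddings
# standing in for the output of an assumed GNN encoder.
dim, num_classes, shots = 16, 3, 5
support_emb = torch.randn(num_classes * shots, dim)
support_y = torch.arange(num_classes).repeat_interleave(shots)
query_emb = torch.randn(10, dim)
print(classify(query_emb, prototypes(support_emb, support_y, num_classes)))
```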


2021, pp. 106884
Author(s): Yao Wu, Yu Song, Hong Huang, Fanghua Ye, Xing Xie, ...

2021
Author(s): Gabriel Jonas Duarte, Tamara Arruda Pereira, Erik Jhones Nascimento, Diego Mesquita, Amauri Holanda Souza Junior

Graph neural networks (GNNs) have become the de facto approach for supervised learning on graph data. To train these networks, most practitioners employ the categorical cross-entropy (CE) loss. We can attribute this largely to the probabilistic interpretation of models trained using CE, since it corresponds to the negative log of the categorical/softmax likelihood. Nonetheless, loss functions are a modeling choice, and other training criteria can be employed, e.g., hinge loss and mean absolute error (MAE). Indeed, recent works have shown that deep learning models can benefit from adopting other loss functions; for instance, neural networks trained with symmetric losses (e.g., MAE) are robust to label noise. Perhaps surprisingly, the effect of using different losses on GNNs has not been explored. In this preliminary work, we gauge the impact of different loss functions on the performance of GNNs for node classification under i) noisy labels and ii) different sample sizes. In contrast to findings on Euclidean domains, our results for GNNs show no significant difference between models trained with CE and other classical loss functions in either scenario.
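The CE-versus-MAE comparison can be reproduced in miniature. The sketch below trains a single GCN-style layer with either loss on a random placeholder graph; the symmetric adjacency normalization is the standard GCN ingredient, while the data, model size, and training schedule are assumptions rather than the authors' setup.

```python
# A minimal sketch contrasting cross-entropy and MAE for node
# classification with a GCN-style layer. Graph, features, and labels
# are random placeholders; only the loss choice differs between runs.
import torch
import torch.nn.functional as F

def normalized_adjacency(adj):
    """Symmetric GCN normalization: D^-1/2 (A + I) D^-1/2."""
    a = adj + torch.eye(adj.size(0))
    d_inv_sqrt = a.sum(dim=1).pow(-0.5)
    return d_inv_sqrt.unsqueeze(1) * a * d_inv_sqrt.unsqueeze(0)

def train(loss_name, a_hat, x, y, num_classes, epochs=200):
    """Fit a one-layer GCN-style model with the given loss; return accuracy."""
    w = torch.nn.Parameter(torch.randn(x.size(1), num_classes) * 0.1)
    opt = torch.optim.Adam([w], lr=0.01)
    for _ in range(epochs):
        logits = a_hat @ x @ w  # one propagation step + linear map
        if loss_name == "ce":
            loss = F.cross_entropy(logits, y)
        else:  # MAE between predicted probabilities and one-hot targets
            probs = F.softmax(logits, dim=1)
            onehot = F.one_hot(y, num_classes).float()
            loss = (probs - onehot).abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    preds = (a_hat @ x @ w).argmax(dim=1)
    return (preds == y).float().mean().item()

# Random placeholder graph, features, and labels (assumptions).
n, d, c = 50, 8, 3
adj = (torch.rand(n, n) < 0.1).float()
adj = ((adj + adj.t()) > 0).float()  # symmetrize
x, y = torch.randn(n, d), torch.randint(0, c, (n,))
a_hat = normalized_adjacency(adj)
for name in ("ce", "mae"):
    print(name, train(name, a_hat, x, y, c))
```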

