Stacking models for nearly optimal link prediction in complex networks

2020 ◽  
Vol 117 (38) ◽  
pp. 23393-23400 ◽  
Author(s):  
Amir Ghasemian ◽  
Homa Hosseinmardi ◽  
Aram Galstyan ◽  
Edoardo M. Airoldi ◽  
Aaron Clauset

Most real-world networks are incompletely observed. Algorithms that can accurately predict which links are missing can dramatically speed up network data collection and improve network model validation. Many algorithms now exist for predicting missing links, given a partially observed network, but it has remained unknown whether a single best predictor exists, how link predictability varies across methods and networks from different domains, and how close to optimality current methods are. We answer these questions by systematically evaluating 203 individual link predictor algorithms, representing three popular families of methods, applied to a large corpus of 550 structurally diverse networks from six scientific domains. We first show that individual algorithms exhibit a broad diversity of prediction errors, such that no one predictor or family is best, or worst, across all realistic inputs. We then exploit this diversity using network-based metalearning to construct a series of “stacked” models that combine predictors into a single algorithm. Applied to a broad range of synthetic networks, for which we may analytically calculate optimal performance, these stacked models achieve optimal or nearly optimal levels of accuracy. Applied to real-world networks, stacked models are superior, but their accuracy varies strongly by domain, suggesting that link prediction may be fundamentally easier in social networks than in biological or technological networks. These results indicate that the state of the art for link prediction comes from combining individual algorithms, which can achieve nearly optimal predictions. We close with a brief discussion of limitations and opportunities for further improvements.
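As a rough illustration of the stacking idea described in this abstract, the Python sketch below feeds three well-known topological predictors (common neighbors, Jaccard coefficient, preferential attachment) into a single meta-learner. The choice of predictors, the logistic-regression meta-learner, and all function names (`build_adj`, `pair_features`, `train_logistic`, `stack_scores`) are assumptions made for illustration; the abstract does not name the learner, and the 203 predictors evaluated in the paper are not reproduced here.

```python
import math
from itertools import combinations


def build_adj(edges):
    """Adjacency sets for an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def pair_features(adj, u, v):
    """Scores from three individual predictors for the node pair (u, v)."""
    common = adj[u] & adj[v]
    union = adj[u] | adj[v]
    return [
        float(len(common)),                           # common neighbors
        len(common) / len(union) if union else 0.0,   # Jaccard coefficient
        float(len(adj[u]) * len(adj[v])),             # preferential attachment
    ]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.05, epochs=500):
    """Hypothetical meta-learner: logistic regression fitted by plain SGD."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            g = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - t
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


def stack_scores(edges, held_out):
    """Hide `held_out` edges, then learn to combine predictors to recover them."""
    held = {tuple(sorted(e)) for e in held_out}
    adj = build_adj([e for e in edges if tuple(sorted(e)) not in held])
    for u, v in edges:                      # keep nodes that lost all their edges
        adj.setdefault(u, set())
        adj.setdefault(v, set())
    # Candidate pairs are all node pairs not linked in the observed graph.
    pairs = [p for p in combinations(sorted(adj), 2) if p[1] not in adj[p[0]]]
    X = [pair_features(adj, u, v) for u, v in pairs]
    y = [1.0 if p in held else 0.0 for p in pairs]
    w, b = train_logistic(X, y)
    return {p: sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            for p, x in zip(pairs, X)}
```

Note that this sketch scores the same pairs it was trained on, which is enough to show the mechanics but would overstate accuracy; a realistic evaluation would train on one set of hidden edges and test on another.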

2014 ◽  
Vol 58 (1) ◽  
pp. 1-38 ◽  
Author(s):  
Peng Wang ◽  
BaoWen Xu ◽  
YuRong Wu ◽  
XiaoYu Zhou

2021 ◽  
Author(s):  
Mohsen Rezvani ◽  
Mojtaba Rezvani

Recent studies have shown that social networks exhibit interesting characteristics such as community structure, i.e., vertices can be clustered into communities that are densely connected internally and loosely connected to other vertices. To identify communities, several definitions have been proposed that characterize the density of connections among vertices. Dense triangle cores, also known as $k$-trusses, are subgraphs in which every edge participates in at least $k-2$ triangles (cliques of size 3), exhibiting a high degree of cohesiveness among vertices. A number of works propose $k$-truss decomposition algorithms. However, existing in-memory algorithms for computing $k$-trusses are inefficient for handling today’s massive networks. In this paper, we propose an efficient and scalable algorithm for finding $k$-trusses in a large-scale network. To this end, we propose a new structure, called a triangle graph, to speed up the process of finding the $k$-trusses, and we prove the correctness and efficiency of our method. We also evaluate the performance of the proposed algorithms through extensive experiments using real-world networks. The results show that the proposed algorithms outperform the state-of-the-art methods by several orders of magnitude in running time.
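To make the $k$-truss definition concrete, the Python sketch below computes a $k$-truss with the standard in-memory edge-peeling approach: any edge supported by fewer than $k-2$ triangles is removed, and the edges that shared a triangle with it are rechecked. This is the simple baseline style of algorithm, not the triangle-graph method the abstract proposes, and the function name `k_truss` is a hypothetical choice.

```python
def k_truss(edges, k):
    """Return the edges (as frozensets) of the k-truss of an undirected graph."""
    adj = {}
    for u, v in edges:
        if u != v:  # ignore self-loops; they cannot belong to a triangle
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)

    # Work list of edges whose triangle support still has to be checked.
    queue = [(u, v) for u in adj for v in adj[u]]
    while queue:
        u, v = queue.pop()
        if v not in adj[u]:
            continue  # edge was already peeled
        # Support of (u, v) = number of common neighbors = triangles through it.
        if len(adj[u] & adj[v]) < k - 2:
            adj[u].discard(v)
            adj[v].discard(u)
            # Every triangle through (u, v) is gone, so recheck its other edges.
            for w in adj[u] & adj[v]:
                queue.append((u, w))
                queue.append((v, w))

    return {frozenset((u, v)) for u in adj for v in adj[u]}
```

For example, on a 4-clique joined by a single bridge edge to a separate triangle, `k_truss(edges, 4)` keeps only the six clique edges, because each of them lies in two triangles while the triangle's edges lie in only one.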


Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-16 ◽  
Author(s):  
Longjie Li ◽  
Shenshen Bai ◽  
Mingwei Leng ◽  
Lu Wang ◽  
Xiaoyun Chen

Link prediction, which aims to forecast potential or missing links in a complex network based on currently observed information, has drawn growing attention from researchers. To date, a host of similarity-based methods have been put forward. Usually, each method assumes that a single similarity measure is applicable to various networks, and its performance therefore fluctuates across networks. In this paper, we propose a novel method to solve this issue by regarding link prediction as a multiple-attribute decision-making (MADM) problem. In the proposed method, we treat the RA, LP, and CAR indices as the attributes of node pairs. The technique for order of preference by similarity to ideal solution (TOPSIS) is adopted to aggregate these attributes and rank node pairs. The proposed method is not limited to a single similarity measure but takes several measures into account, since different networks may have different topological structures. Experimental results on 10 real-world networks show that the proposed method is superior to state-of-the-art methods.
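The aggregation step can be sketched with a textbook TOPSIS routine in Python. Each row of the decision matrix is a candidate node pair and each column is one similarity index (for instance RA, LP, and CAR scores computed elsewhere). Equal weights, vector normalization, and the function name `topsis` are assumptions of this sketch; the abstract does not state how the paper weights or normalizes the indices.

```python
import math


def topsis(matrix, weights=None):
    """Closeness of each row to the ideal solution (higher is better).

    All columns are treated as benefit criteria, i.e. larger values are better,
    which fits similarity indices.
    """
    n_cols = len(matrix[0])
    if weights is None:
        weights = [1.0 / n_cols] * n_cols  # assumed: equal weights

    # Vector-normalize each column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
             for j in range(n_cols)]
    scaled = [[weights[j] * row[j] / norms[j] for j in range(n_cols)]
              for row in matrix]

    # Ideal and anti-ideal alternatives, taken column by column.
    best = [max(row[j] for row in scaled) for j in range(n_cols)]
    worst = [min(row[j] for row in scaled) for j in range(n_cols)]

    scores = []
    for row in scaled:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        total = d_best + d_worst
        scores.append(d_worst / total if total > 0 else 0.0)
    return scores
```

Ranking node pairs by the returned closeness values in descending order gives the predicted links; a pair that is best on every index scores 1.0 and a pair that is worst on every index scores 0.0.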


Author(s):  
Shubham Gupta ◽  
Gaurav Sharma ◽  
Ambedkar Dukkipati

Networks observed in the real world, such as social networks and collaboration networks, exhibit temporal dynamics, i.e., nodes and edges appear and/or disappear over time. In this paper, we propose a generative, latent-space-based statistical model for such networks (called dynamic networks). We consider the case where the number of nodes is fixed, but the presence of edges can vary over time. Our model allows the number of communities in the network to differ across time steps. We use a neural-network-based methodology to perform approximate inference in the proposed model and its simplified version. Experiments on synthetic and real-world networks for the tasks of community detection and link prediction demonstrate the utility and effectiveness of our model compared with similar existing approaches.


2015 ◽  
Vol 07 (03) ◽  
pp. 1550037 ◽  
Author(s):  
Huan Ma ◽  
Yuqing Zhu ◽  
Deying Li ◽  
Donghyun Kim ◽  
Jun Liang

The influence maximization problem in social networks is to find a set of seed nodes such that the total influence effect is maximized under a given cascade model. In this paper, we propose a novel task of improving influence, which is to find strategies for allocating an investment budget under the IC-N model. We prove that our influence-improving problem is 𝒩𝒫-hard, and propose new algorithms under the IC-N model. To the best of our knowledge, our work is the first to study the influence-improving problem under a bounded budget when negative opinions emerge. Finally, we conduct extensive experiments on a large data collection obtained from real-world social networks, and evaluate the performance of our approach.


Entropy ◽  
2021 ◽  
Vol 23 (6) ◽  
pp. 771
Author(s):  
Qiang Wei ◽  
Guangmin Hu

Collected network data are often incomplete, with both missing nodes and missing edges. Thus, network completion that infers the unobserved part of the network is essential for downstream tasks. Despite the emerging literature related to network recovery, the potential information has not been effectively exploited. In this paper, we propose a novel unified deep graph convolutional network that infers missing edges by leveraging node labels, features, and distances. Specifically, we first construct an estimated network topology for the unobserved part using node labels, then jointly refine the network topology and learn the edge likelihood with node labels, node features and distances. Extensive experiments using several real-world datasets show the superiority of our method compared with the state-of-the-art approaches.


Entropy ◽  
2019 ◽  
Vol 21 (3) ◽  
pp. 254 ◽  
Author(s):  
Shaokai Wang ◽  
Xutao Li ◽  
Yunming Ye ◽  
Shanshan Feng ◽  
Raymond Lau ◽  
...  

Presently, many users are involved in multiple social networks. Identifying the same user in different networks, also known as anchor link prediction, has become an important problem that serves numerous applications, e.g., cross-network recommendation and user profiling. Previous studies mainly use hand-crafted structural features, which, if not carefully designed, may fail to reflect the intrinsic structural regularities. Moreover, most of the methods neglect the attribute information of social networks. In this paper, we propose a novel semi-supervised network-embedding model to address the problem. In the model, each node of the multiple networks is represented by a vector for anchor link prediction, which is learnt with observed anchor links as semi-supervised information, and with topological structure and attributes as input. Experimental results on real-world data sets demonstrate the superiority of the proposed model compared to state-of-the-art techniques.


2020 ◽  
Vol 8 (S1) ◽  
pp. S65-S81 ◽  
Author(s):  
Shazia Tabassum ◽  
Bruno Veloso ◽  
João Gama

The link prediction task has found numerous applications in real-world scenarios. However, in most cases, such as interactions, purchases, and mobility, links can recur again and again over time. As a result, the data being generated are too large to handle, a difficulty compounded by the complexity and sparsity of the networks. Therefore, we propose a very fast, memory-less, and dynamic sampling-based method for predicting recurring links for a successive future point in time. This method works by biasing the links exponentially based on their time of occurrence, frequency, and stability. To evaluate the efficiency of our method, we carried out rigorous experiments with massive real-world graph streams. Our empirical results show that the proposed method outperforms the state-of-the-art method for recurring link prediction. Additionally, we empirically analyzed the evolution of links from the perspective of multi-graph topology and their recurrence probability over time.
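One simple way to bias links exponentially by recency and frequency is a decayed occurrence count, sketched below in Python. This is a generic exponential time-decay score, not the sampling scheme the abstract refers to: it keeps one small record per link, it ignores the stability component, and the class name `DecayedLinkScores` and the `half_life` parameter are hypothetical.

```python
import math


class DecayedLinkScores:
    """Exponentially time-decayed occurrence counts for recurring links."""

    def __init__(self, half_life=1.0):
        # A link's weight halves every `half_life` time units without a recurrence.
        self.rate = math.log(2) / half_life
        self.state = {}  # link -> (score at last update, time of last update)

    def _decayed(self, link, now):
        score, last = self.state.get(link, (0.0, now))
        return score * math.exp(-self.rate * (now - last))

    def observe(self, u, v, t):
        """Record one occurrence of the undirected link (u, v) at time t."""
        link = frozenset((u, v))
        # Decay the old score up to time t, then add 1 for the new occurrence.
        self.state[link] = (self._decayed(link, t) + 1.0, t)

    def score(self, u, v, now):
        """Decayed score of link (u, v) as seen at time `now`."""
        return self._decayed(frozenset((u, v)), now)

    def top_k(self, k, now):
        """The k links most likely to recur under this score, best first."""
        ranked = sorted(self.state, key=lambda l: self._decayed(l, now),
                        reverse=True)
        return [(link, self._decayed(link, now)) for link in ranked[:k]]
```

Events are assumed to arrive in non-decreasing time order. A link seen often and recently accumulates a high score, while a link that stops recurring fades toward zero at a rate set by `half_life`.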

