Stacking models for nearly optimal link prediction in complex networks

Most real-world networks are incompletely observed. Algorithms that can accurately predict which links are missing can dramatically speed up network data collection and improve network model validation. Many algorithms now exist for predicting missing links, given a partially observed network, but it has remained unknown whether a single best predictor exists, how link predictability varies across methods and networks from different domains, and how close to optimality current methods are. We answer these questions by systematically evaluating 203 individual link predictor algorithms, representing three popular families of methods, applied to a large corpus of 550 structurally diverse networks from six scientific domains. We first show that individual algorithms exhibit a broad diversity of prediction errors, such that no one predictor or family is best, or worst, across all realistic inputs. We then exploit this diversity using network-based metalearning to construct a series of “stacked” models that combine predictors into a single algorithm. Applied to a broad range of synthetic networks, for which we may analytically calculate optimal performance, these stacked models achieve optimal or nearly optimal levels of accuracy. Applied to real-world networks, stacked models are superior, but their accuracy varies strongly by domain, suggesting that link prediction may be fundamentally easier in social networks than in biological or technological networks. These results indicate that the state of the art for link prediction comes from combining individual algorithms, which can achieve nearly optimal predictions. We close with a brief discussion of limitations and opportunities for further improvements.
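The stacking idea described above can be sketched in a few lines: several simple predictors each score a node pair, and a supervised meta-learner is trained on those scores using held-out edges as positives and non-edges as negatives. Everything specific here is an assumption for illustration and not taken from the abstract: the three predictors (common neighbors, Jaccard coefficient, preferential attachment), the logistic-regression meta-learner, the toy network, and all function names.

```python
import math

def build_adjacency(edges):
    """Map each node to the set of its neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def pair_features(adj, u, v):
    """Scores from three simple predictors, plus a constant bias term."""
    nu, nv = adj.get(u, set()), adj.get(v, set())
    common = len(nu & nv)                        # common neighbors
    union = len(nu | nv)
    jaccard = common / union if union else 0.0   # Jaccard coefficient
    pref = math.log1p(len(nu) * len(nv))         # log preferential attachment
    return [1.0, float(common), jaccard, pref]

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_meta_learner(X, y, lr=0.1, epochs=500):
    """Logistic regression trained by plain stochastic gradient ascent."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
            w = [wj + lr * (yi - p) * xj for wj, xj in zip(w, xi)]
    return w

def predict(w, x):
    """Stacked score: probability-like value that the pair is a missing link."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)))

# Toy network (hypothetical): two 5-cliques joined by one bridge edge.
clique_a = [(i, j) for i in range(5) for j in range(i + 1, 5)]
clique_b = [(i, j) for i in range(5, 10) for j in range(i + 1, 10)]
true_edges = clique_a + clique_b + [(4, 5)]

# Hide a few true edges (positives) and pick a few non-edges (negatives).
held_out = [(0, 1), (2, 3), (5, 6), (7, 8)]
non_edges = [(0, 9), (1, 8), (2, 7), (3, 6)]
observed = [e for e in true_edges if e not in held_out]
adj = build_adjacency(observed)

# Features are computed on the observed network only.
X = [pair_features(adj, u, v) for u, v in held_out + non_edges]
y = [1] * len(held_out) + [0] * len(non_edges)
w = fit_meta_learner(X, y)

candidates = [(0, 1), (5, 6), (4, 9), (0, 9)]
scores = {pair: predict(w, pair_features(adj, *pair)) for pair in candidates}
```

Sorting `scores` by value gives a ranked list of candidate missing links. A real evaluation would score pairs that were not used for training and would use many more predictors.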

Explainable and Efficient Link Prediction in Real-World Network Data

Lecture Notes in Computer Science - Advances in Intelligent Data Analysis XV ◽

10.1007/978-3-319-46349-0_26 ◽

2016 ◽

pp. 295-307 ◽

Cited By ~ 1

Author(s):

Jesper E. van Engelen ◽

Hanjo D. Boekhout ◽

Frank W. Takes

Keyword(s):

Real World ◽

Link Prediction ◽

Network Data

Link prediction in social networks: the state-of-the-art

Science China Information Sciences ◽

10.1007/s11432-014-5237-y ◽

2014 ◽

Vol 58 (1) ◽

pp. 1-38 ◽

Cited By ~ 127

Author(s):

Peng Wang ◽

BaoWen Xu ◽

YuRong Wu ◽

XiaoYu Zhou

Keyword(s):

Social Networks ◽

Link Prediction ◽

State Of The Art ◽

The State

Truss Decomposition using Triangle Graphs

10.21203/rs.3.rs-819379/v1 ◽

2021 ◽

Author(s):

Mohsen Rezvani ◽

Mojtaba Rezvani

Keyword(s):

Social Networks ◽

Large Scale ◽

State Of The Art ◽

Decomposition Algorithms ◽

Scalable Algorithm ◽

Large Scale Network ◽

Triangle Graph ◽

Speed Up ◽

Scale Network ◽

High Degree

Recent studies have shown that social networks exhibit interesting characteristics such as community structure, i.e., vertices can be clustered into communities that are densely connected internally and loosely connected to other vertices. To identify communities, several definitions have been proposed that characterize the density of connections among vertices in a network. Dense triangle cores, also known as $k$-trusses, are subgraphs in which every edge participates in at least $k-2$ triangles (a triangle being a clique of size 3), exhibiting a high degree of cohesiveness among vertices. A number of research works propose $k$-truss decomposition algorithms. However, existing in-memory algorithms for computing $k$-trusses are inefficient for handling today’s massive networks. In this paper, we propose an efficient yet scalable algorithm for finding $k$-trusses in a large-scale network. To this end, we propose a new structure, called the triangle graph, to speed up the process of finding the $k$-trusses, and we prove the correctness and efficiency of our method. We also evaluate the performance of the proposed algorithms through extensive experiments using real-world networks. The results show that the proposed algorithms outperform the state-of-the-art methods by several orders of magnitude in running time.
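To make the $k$-truss definition concrete, here is a minimal sketch that computes a $k$-truss by naive peeling: it repeatedly deletes any edge that lies in fewer than $k-2$ triangles until no such edge remains. This is the textbook baseline, not the triangle-graph method proposed in the paper; the function name and the toy edge list are assumptions made for illustration.

```python
from collections import defaultdict

def k_truss(edges, k):
    """Return the edges of the k-truss: the largest subgraph in which
    every edge lies in at least k-2 triangles. Nodes must be comparable."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    # Peel: delete edges whose support (triangle count) is below k-2,
    # and repeat, since each deletion can lower the support of other edges.
    changed = True
    while changed:
        changed = False
        for u in list(adj):
            for v in list(adj[u]):
                # Support of (u, v) is the number of common neighbors.
                if u < v and len(adj[u] & adj[v]) < k - 2:
                    adj[u].discard(v)
                    adj[v].discard(u)
                    changed = True
    return sorted((u, v) for u in adj for v in adj[u] if u < v)

# Toy graph (hypothetical): a 4-clique on {0,1,2,3}, a triangle {3,4,5},
# and a pendant edge (5,6).
EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (3, 4), (4, 5), (3, 5), (5, 6)]

truss3 = k_truss(EDGES, 3)  # drops only the pendant edge
truss4 = k_truss(EDGES, 4)  # keeps only the 4-clique
```

Each pass rescans every edge, so this sketch is only suitable for small graphs; scalable decomposition algorithms avoid exactly this repeated recomputation.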

Finding Missing Links in Complex Networks: A Multiple-Attribute Decision-Making Method

Complexity ◽

10.1155/2018/3579758 ◽

2018 ◽

Vol 2018 ◽

pp. 1-16 ◽

Cited By ~ 3

Author(s):

Longjie Li ◽

Shenshen Bai ◽

Mingwei Leng ◽

Lu Wang ◽

Xiaoyun Chen

Keyword(s):

Decision Making ◽

Complex Networks ◽

Similarity Measure ◽

Real World ◽

Link Prediction ◽

State Of The Art ◽

Multiple Attribute Decision Making ◽

Ideal Solution ◽

Multiple Attribute ◽

Novel Method

Link prediction, which aims to forecast potential or missing links in a complex network based on currently observed information, has drawn growing attention from researchers. To date, a host of similarity-based methods have been put forward. Usually, a method assumes that a single similarity measure is applicable to various networks, and its performance therefore fluctuates across networks. In this paper, we propose a novel method to solve this issue by regarding link prediction as a multiple-attribute decision-making (MADM) problem. In the proposed method, we consider the RA, LP, and CAR indices as the multiple attributes of node pairs. The technique for order performance by similarity to ideal solution (TOPSIS) is adopted to aggregate the attributes and rank node pairs. The proposed method is not limited to a single similarity measure but takes several measures into account, since different networks may have different topological structures. Experimental results on 10 real-world networks show that the proposed method is superior to state-of-the-art methods.
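The aggregation step can be illustrated with a generic TOPSIS sketch: each node pair is a row of attribute scores, columns are normalized and weighted, and pairs are ranked by their relative closeness to the ideal (best-per-column) point. The equal weights, the vector normalization, and the example scores below are assumptions made for illustration; the abstract does not state how the paper weights or normalizes the RA, LP, and CAR indices.

```python
import math

def topsis(rows, weights=None):
    """Closeness of each row to the ideal solution, in [0, 1].
    All attributes are treated as benefits (larger is better)."""
    n_attr = len(rows[0])
    weights = weights or [1.0 / n_attr] * n_attr
    # Vector-normalize each column, then apply the weights.
    norms = [math.sqrt(sum(r[j] ** 2 for r in rows)) or 1.0
             for j in range(n_attr)]
    scaled = [[weights[j] * r[j] / norms[j] for j in range(n_attr)]
              for r in rows]
    best = [max(col) for col in zip(*scaled)]    # ideal solution
    worst = [min(col) for col in zip(*scaled)]   # negative-ideal solution
    closeness = []
    for r in scaled:
        d_best = math.dist(r, best)
        d_worst = math.dist(r, worst)
        total = d_best + d_worst
        closeness.append(d_worst / total if total > 0 else 0.0)
    return closeness

# Hypothetical similarity scores per node pair, in the order (RA, LP, CAR).
pairs = [("a", "b"), ("a", "c"), ("b", "d")]
attributes = [[0.9, 3.0, 2.0],
              [0.1, 1.0, 0.0],
              [0.5, 2.0, 1.0]]

closeness = topsis(attributes)
ranking = [p for _, p in sorted(zip(closeness, pairs), reverse=True)]
```

Pairs at the top of `ranking` are the ones predicted most likely to be missing links. Because the ranking depends on all columns at once, a pair that is mediocre on one index can still rank highly if the others agree.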

A Generative Model for Dynamic Networks with Applications

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33017842 ◽

2019 ◽

Vol 33 ◽

pp. 7842-7849

Author(s):

Shubham Gupta ◽

Gaurav Sharma ◽

Ambedkar Dukkipati

Keyword(s):

Neural Network ◽

Social Networks ◽

Real World ◽

Link Prediction ◽

Temporal Dynamics ◽

Dynamic Networks ◽

Collaboration Networks ◽

Proposed Model ◽

Latent Space ◽

Over Time

Networks observed in the real world, such as social networks and collaboration networks, exhibit temporal dynamics, i.e., nodes and edges appear and/or disappear over time. In this paper, we propose a generative, latent-space-based statistical model for such networks (called dynamic networks). We consider the case where the number of nodes is fixed, but the presence of edges can vary over time. Our model allows the number of communities in the network to differ across time steps. We use a neural-network-based methodology to perform approximate inference in the proposed model and its simplified version. Experiments on synthetic and real-world networks for the tasks of community detection and link prediction demonstrate the utility and effectiveness of our model compared to other similar existing approaches.

Improving the influence under IC-N model in social networks

Discrete Mathematics Algorithms and Applications ◽

10.1142/s1793830915500378 ◽

2015 ◽

Vol 07 (03) ◽

pp. 1550037 ◽

Cited By ~ 3

Author(s):

Huan Ma ◽

Yuqing Zhu ◽

Deying Li ◽

Donghyun Kim ◽

Jun Liang

Keyword(s):

Social Networks ◽

Data Collection ◽

Real World ◽

Large Data ◽

Maximization Problem ◽

Influence Effect ◽

Cascade Models ◽

Total Influence ◽

New Algorithms ◽

Influence Maximization Problem

The influence maximization problem in social networks is to find a set of seed nodes such that the total influence effect is maximized under certain cascade models. In this paper, we propose a novel task of improving influence, which is to find strategies for allocating the investment budget under the IC-N model. We prove that our influence improving problem is 𝒩𝒫-hard, and propose new algorithms under the IC-N model. To the best of our knowledge, our work is the first to study the influence improving problem under a bounded budget when negative opinions emerge. Finally, we conduct extensive experiments over a large data collection obtained from real-world social networks and evaluate the performance of our approach.

Unifying Node Labels, Features, and Distances for Deep Network Completion

Entropy ◽

10.3390/e23060771 ◽

2021 ◽

Vol 23 (6) ◽

pp. 771

Author(s):

Qiang Wei ◽

Guangmin Hu

Keyword(s):

Network Topology ◽

Real World ◽

State Of The Art ◽

The State ◽

Network Data ◽

Convolutional Network ◽

Deep Network ◽

Network Recovery ◽

Real World Datasets ◽

Node Labels

Collected network data are often incomplete, with both missing nodes and missing edges. Thus, network completion that infers the unobserved part of the network is essential for downstream tasks. Despite the emerging literature related to network recovery, the potential information has not been effectively exploited. In this paper, we propose a novel unified deep graph convolutional network that infers missing edges by leveraging node labels, features, and distances. Specifically, we first construct an estimated network topology for the unobserved part using node labels, then jointly refine the network topology and learn the edge likelihood with node labels, node features and distances. Extensive experiments using several real-world datasets show the superiority of our method compared with the state-of-the-art approaches.

Anchor Link Prediction across Attributed Networks via Network Embedding

Entropy ◽

10.3390/e21030254 ◽

2019 ◽

Vol 21 (3) ◽

pp. 254 ◽

Cited By ~ 4

Author(s):

Shaokai Wang ◽

Xutao Li ◽

Yunming Ye ◽

Shanshan Feng ◽

Raymond Lau ◽

...

Keyword(s):

Social Networks ◽

Link Prediction ◽

State Of The Art ◽

User Profiling ◽

Data Sets ◽

Network Embedding ◽

Real World Data ◽

Intrinsic Structure ◽

Multiple Networks ◽

Proposed Model

Presently, many users are involved in multiple social networks. Identifying the same user in different networks, also known as anchor link prediction, has become an important problem that serves numerous applications, e.g., cross-network recommendation and user profiling. Previous studies mainly use hand-crafted structural features, which, if not carefully designed, may fail to reflect the intrinsic structural regularities. Moreover, most methods neglect the attribute information of social networks. In this paper, we propose a novel semi-supervised network-embedding model to address the problem. In the model, each node of the multiple networks is represented by a vector for anchor link prediction, which is learnt using observed anchor links as semi-supervised information and topological structure and attributes as input. Experimental results on real-world data sets demonstrate the superiority of the proposed model compared to state-of-the-art techniques.

On fast and scalable recurring link’s prediction in evolving multi-graph streams

Network Science ◽

10.1017/nws.2019.64 ◽

2020 ◽

Vol 8 (S1) ◽

pp. S65-S81 ◽

Cited By ~ 1

Author(s):

Shazia Tabassum ◽

Bruno Veloso ◽

João Gama

Keyword(s):

Real World ◽

Link Prediction ◽

State Of The Art ◽

Occurrence Frequency ◽

Graph Topology ◽

Empirical Results ◽

Graph Streams ◽

Dynamic Sampling ◽

Fast Memory ◽

Over Time

The link prediction task has found numerous applications in real-world scenarios. However, in most cases, such as interactions, purchases, and mobility, links can recur again and again over time. As a result, the data being generated are too large to handle, on top of the complexity and sparsity of the networks. Therefore, we propose a very fast, memory-less, and dynamic sampling-based method for predicting recurring links for a successive future point in time. This method works by biasing the links exponentially based on their time of occurrence, frequency, and stability. To evaluate the efficiency of our method, we carried out rigorous experiments with massive real-world graph streams. Our empirical results show that the proposed method outperforms the state-of-the-art method for recurring link prediction. Additionally, we empirically analyzed the evolution of links from the perspective of multi-graph topology and their recurrence probability over time.
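One generic way to bias links exponentially by recency and frequency is an exponentially decayed occurrence count, sketched below. This is not the sampling method of the paper, and it does not model the stability factor mentioned in the abstract; the function name, the decay rate, and the event stream are all hypothetical.

```python
import math
from collections import defaultdict

def recurrence_scores(events, now, decay=0.1):
    """Score each link by summing exp(-decay * age) over its occurrences.
    Recent and frequent links receive higher scores."""
    scores = defaultdict(float)
    for u, v, t in events:
        scores[(u, v)] += math.exp(-decay * (now - t))
    return dict(scores)

# Hypothetical stream of (source, target, timestamp) events.
events = [("a", "b", 1), ("a", "b", 9), ("c", "d", 10), ("e", "f", 2)]

scores = recurrence_scores(events, now=10)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Here the link ("a", "b") ranks first because it occurred twice, once recently, while ("e", "f") ranks last because its single occurrence is old. A streaming variant would update the scores incrementally instead of storing past events.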

Pedestrian network data collection through location-based social networks

Proceedings of the 5th International ICST Conference on Collaborative Computing: Networking, Applications, Worksharing ◽

10.4108/icst.collaboratecom2009.8388 ◽

2009 ◽

Cited By ~ 6

Author(s):

Piyawan Kasemsuppakorn ◽

Hassan A. Karimi

Keyword(s):

Social Networks ◽

Data Collection ◽

Network Data ◽

Pedestrian Network ◽

Location Based Social Networks
