An effective similarity measure based on kernel spectral method for complex networks

2019 ◽  
Vol 30 (07) ◽  
pp. 1940005
Author(s):  
Longjie Li ◽  
Lu Wang ◽  
Shenshen Bai ◽  
Shiyu Fang ◽  
Jianjun Cheng ◽  
...  

Measuring node similarity is an especially important task in complex network analysis and plays a critical role in a multitude of applications, such as link prediction, community detection, and recommender systems. In this study, we are interested in link-based similarity measures, which consider only the structural information of networks when estimating node similarity. A new algorithm is proposed that adopts the idea of the kernel spectral method to quantify the similarity of nodes. When computing the kernel matrix, the proposed algorithm makes use of local structural information, whereas it takes advantage of global information when constructing the feature matrix. Hence, the proposed algorithm can better capture potential relationships between nodes. To show the superiority of our algorithm over others, we conduct experiments on 10 real-world networks. Experimental results demonstrate that our algorithm yields more reasonable results and higher accuracy than the baselines.

2017 ◽  
Vol 28 (08) ◽  
pp. 1750101 ◽  
Author(s):  
Yabing Yao ◽  
Ruisheng Zhang ◽  
Fan Yang ◽  
Yongna Yuan ◽  
Qingshuang Sun ◽  
...  

In complex networks, existing link prediction methods primarily focus on the internal structural information derived from single-layer networks. However, the role of interlayer information is hardly recognized in multiplex networks, which provide more diverse structural features than single-layer networks. In fact, the structural properties and functions of one layer can affect those of other layers in multiplex networks. In this paper, the effect of interlayer structural properties on link prediction performance is investigated in multiplex networks. By utilizing the intralayer and interlayer information, we propose a novel “Node Similarity Index” based on “Layer Relevance” (NSILR) of multiplex networks for link prediction. The performance of the NSILR index is validated on each layer of seven multiplex networks in real-world systems. Experimental results show that the NSILR index can significantly improve the prediction performance compared with traditional methods, which consider only the intralayer information. Furthermore, the more relevant the layers are, the greater the performance improvement.


2020 ◽  
Author(s):  
Mustafa Coşkun ◽  
Mehmet Koyutürk

Motivation: Link prediction is an important and well-studied problem in computational biology, with a broad range of applications including disease gene prioritization, drug-disease associations, and drug response in cancer. The general principle in link prediction is to use the topological characteristics and the attributes (if available) of the nodes in the network to predict new links that are likely to emerge or disappear. Recently, graph representation learning methods, which aim to learn a low-dimensional representation of the topological characteristics and attributes of the nodes, have drawn increasing attention as a way to solve the link prediction problem via learnt low-dimensional features. Most prominently, Graph Convolution Network (GCN)-based network embedding methods have demonstrated great promise in link prediction due to their ability to capture non-linear information in the network. To date, GCN-based network embedding algorithms utilize a Laplacian matrix in their convolution layers as the convolution matrix, and the effect of the convolution matrix on algorithm performance has not been comprehensively characterized in the context of link prediction in biomedical networks. On the other hand, for a variety of biomedical link prediction tasks, traditional node similarity measures such as Common Neighbor, Adamic-Adar, and others have shown promising results, and hence there is a need to systematically evaluate the node similarity measures as convolution matrices in terms of their usability and potential to further the state of the art.
Results: We select 8 representative node similarity measures as convolution matrices within the single-layered GCN graph embedding method and conduct a systematic comparison on 3 important biomedical link prediction tasks: drug-disease association (DDA) prediction, drug–drug interaction (DDI) prediction, and protein–protein interaction (PPI) prediction. Our experimental results demonstrate that the node similarity-based convolution matrices significantly improve GCN-based embedding algorithms and deserve more attention in future biomedical link prediction.
Availability: Our method is implemented as a Python library and is available at [email protected]
Supplementary information: Supplementary data are available at Bioinformatics online.
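To make the idea of a similarity-based convolution matrix concrete, here is a minimal pure-Python sketch of one GCN propagation step in which the usual normalized adjacency is swapped for a Common Neighbor matrix. It is not the authors' library: the function names, the choice of `A @ A` as the Common Neighbor matrix, and the symmetric normalization with added self-loops are all assumptions made for illustration.

```python
import math

def common_neighbor_matrix(adj):
    """S[i][j] = number of common neighbors of i and j, i.e. (A @ A)[i][j]
    for a symmetric 0/1 adjacency matrix A. The diagonal holds node degrees."""
    n = len(adj)
    return [[sum(adj[i][k] * adj[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sym_normalize(S):
    """Return D^{-1/2} (S + I) D^{-1/2}, where D holds the row sums of S + I.
    Assumption: the same normalization commonly applied to the adjacency matrix."""
    n = len(S)
    S_hat = [[S[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    d = [sum(row) for row in S_hat]
    return [[S_hat[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    """Plain dense matrix product for lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def gcn_layer(conv, X, W):
    """One propagation step: ReLU(conv @ X @ W)."""
    Z = matmul(matmul(conv, X), W)
    return [[max(0.0, z) for z in row] for row in Z]

# Toy path graph 0 - 1 - 2 with one-hot node features and a hypothetical weight matrix.
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
W = [[1.0, -1.0], [0.0, 1.0], [1.0, 0.0]]

S = common_neighbor_matrix(A)
conv = sym_normalize(S)
H = gcn_layer(conv, X, W)  # 3 nodes x 2 embedding dimensions
```

Any other similarity measure could be substituted for `common_neighbor_matrix` as long as it yields a non-negative square matrix; in a real model `W` would be learned rather than fixed.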


2013 ◽  
Vol 27 (06) ◽  
pp. 1350039 ◽  
Author(s):  
JING WANG ◽  
LILI RONG

Link prediction in complex networks has attracted much attention recently. Many local similarity measures based on node similarity have been proposed. Among these local similarity indices, the neighborhood-based indices Common Neighbors (CN), Adamic-Adar (AA) and Resource Allocation (RA) perform best. It is found that node similarity indices that require only information on the nearest neighbors achieve high scores and have very low computational complexity. In this paper, a new index based on the contribution of common neighbor nodes to edges is proposed. The index is simple and efficient, and it is shown to predict as well as or better than other neighborhood-based indices, especially for networks with a low clustering coefficient.
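For reference, the three neighborhood-based baselines named in this abstract can be sketched in a few lines of Python using their standard definitions: CN counts shared neighbors, AA weights each shared neighbor z by 1/log(k_z), and RA weights it by 1/k_z, where k_z is the degree of z. The helper names and the toy graph are illustrative only, and the new index proposed in the paper is not reproduced here.

```python
import math

def build_adjacency(edges):
    """Build {node: set of neighbors} from a list of undirected edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def common_neighbors(adj, x, y):
    """CN: number of neighbors shared by x and y."""
    return len(adj[x] & adj[y])

def adamic_adar(adj, x, y):
    """AA: sum of 1 / log(degree) over shared neighbors.
    For x != y, a shared neighbor has degree >= 2, so the log is positive."""
    return sum(1.0 / math.log(len(adj[z])) for z in adj[x] & adj[y])

def resource_allocation(adj, x, y):
    """RA: sum of 1 / degree over shared neighbors."""
    return sum(1.0 / len(adj[z]) for z in adj[x] & adj[y])

# Toy graph: a and b share neighbors c (degree 2) and d (degree 3).
edges = [("a", "c"), ("b", "c"), ("a", "d"), ("b", "d"), ("d", "e")]
adj = build_adjacency(edges)

cn = common_neighbors(adj, "a", "b")
aa = adamic_adar(adj, "a", "b")
ra = resource_allocation(adj, "a", "b")
```

All three scores touch only the neighbor sets of the two endpoints, which is why such indices have very low computational cost.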


2019 ◽  
Vol 4 (1) ◽  
Author(s):  
Sinan G. Aksoy ◽  
Kathleen E. Nowak ◽  
Emilie Purvine ◽  
Stephen J. Young

Similarity measures are used extensively in machine learning and data science algorithms. The newly proposed graph Relative Hausdorff (RH) distance is a lightweight yet nuanced similarity measure for quantifying the closeness of two graphs. In this work we study the effectiveness of RH distance as a tool for detecting anomalies in time-evolving graph sequences. We apply RH to cyber data with given red team events, as well as to synthetically generated sequences of graphs with planted attacks. In our experiments, the performance of RH distance is at times comparable, and sometimes superior, to graph edit distance in detecting anomalous phenomena. Our results suggest that in appropriate contexts, RH distance has advantages over more computationally intensive similarity measures.


2021 ◽  
Author(s):  
Amin Rezaeipanah

Online social networks are an integral element of modern societies and significantly influence the formation and consolidation of social relationships. In fact, these networks are multi-layered, so that a user may have multiple links across different social networks. In this paper, the link prediction problem for the same user in a two-layer social network is examined, where we consider the Twitter and Foursquare networks. Here, information related to the two-layer communication is used to predict links in the Foursquare network. Link prediction aims to discover spurious links or predict the emergence of future links from the current network structure. There are many algorithms for link prediction in unweighted networks; however, only a few have been developed for weighted networks. Based on the extraction of topological features from the network structure and the use of reliable paths between users, we developed a novel similarity measure for link prediction. Reliable paths have been proposed to extend unweighted local similarity measures to weighted measures. Using these measures, both the existence of links and their weights can be predicted. Empirical analysis shows that the proposed similarity measure achieves superior performance to existing approaches and can more accurately predict future relationships. In addition, the proposed method yields better results than its single-layer counterpart. Experiments show that the proposed similarity measure has a precision advantage of 1.8% over the Katz and FriendLink measures.
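The abstract does not give the formula for its reliable-path measure. As a rough illustration of how a path-reliability idea can turn an unweighted local index into a weighted one, the sketch below assumes each link weight lies in (0, 1] and is read as the reliability of that link, so a two-hop path x-z-y has reliability w(x, z) * w(z, y). Summing this over shared neighbors gives a weighted analogue of Common Neighbors. This reading, and every name in the code, is an assumption rather than the paper's method.

```python
def build_weighted_adjacency(weighted_edges):
    """Build {node: {neighbor: weight}} from undirected (u, v, weight) triples."""
    adj = {}
    for u, v, w in weighted_edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj

def reliable_path_common_neighbors(adj, x, y):
    """Weighted CN: sum over shared neighbors z of w(x, z) * w(z, y).
    Assumes weights in (0, 1], interpreted as link reliabilities."""
    shared = adj[x].keys() & adj[y].keys()
    return sum(adj[x][z] * adj[z][y] for z in shared)

# Toy weighted graph: a and b are linked through c and d.
weighted_edges = [("a", "c", 0.5), ("c", "b", 0.4), ("a", "d", 1.0), ("d", "b", 0.2)]
wadj = build_weighted_adjacency(weighted_edges)

score = reliable_path_common_neighbors(wadj, "a", "b")
```

With all weights set to 1 the score reduces to the plain common-neighbor count, which is the sense in which the weighted measure extends the unweighted one.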


2020 ◽  
Vol 34 (04) ◽  
pp. 4772-4779 ◽  
Author(s):  
Yu Li ◽  
Yuan Tian ◽  
Jiawei Zhang ◽  
Yi Chang

Learning the low-dimensional representations of graphs (i.e., network embedding) plays a critical role in network analysis and facilitates many downstream tasks. Recently, graph convolutional networks (GCNs) have revolutionized the field of network embedding and led to state-of-the-art performance in network analysis tasks such as link prediction and node classification. Nevertheless, most existing GCN-based network embedding methods are proposed for unsigned networks. In the real world, however, some networks are signed, with links annotated with different polarities, e.g., positive vs. negative. Negative links may have different properties from positive ones and can significantly affect the quality of network embedding. Thus, in this paper, we propose a novel network embedding framework, SNEA, to learn Signed Network Embedding via graph Attention. In particular, we propose a masked self-attentional layer, which leverages a self-attention mechanism to estimate the importance coefficient for pairs of nodes connected by different types of links during the embedding aggregation process. SNEA then utilizes the masked self-attentional layers to aggregate more important information from neighboring nodes and generate the node embeddings based on balance theory. Experimental results on a signed link prediction task over several real-world signed network datasets demonstrate the effectiveness of the proposed framework.


Author(s):  
Rongrong Song ◽  
Guang Ling ◽  
Qingju Fan ◽  
Ming-Feng Ge ◽  
Fang Wang

Link prediction, which aims to find missing links in a current network or to predict possible new links in a future network, is a challenging problem in complex networks. Many existing link prediction algorithms perform the task by optimizing node similarity measures and then determining the possibility of a link between any pair of similar nodes. In this paper, we propose a novel node similarity index named heterogeneous degree penalization (HDP), which incorporates the quasi-local structural information of the extended neighborhood of each pair of nodes to be predicted and the clustering coefficient of their common neighbors. For specific networks with different statistical properties, good link prediction performance can be achieved by adjusting the penalty weights. The experimental results show that, compared with other existing approaches, the proposed method can remarkably improve the accuracy of link prediction.


2019 ◽  
Vol 476 (21) ◽  
pp. 3227-3240 ◽  
Author(s):  
Shanshan Wang ◽  
Yanxiang Zhao ◽  
Long Yi ◽  
Minghe Shen ◽  
Chao Wang ◽  
...  

Trehalose-6-phosphate (T6P) synthase (Tps1) catalyzes the formation of T6P from UDP-glucose (UDPG) (or GDPG, etc.) and glucose-6-phosphate (G6P), but the structural basis of this process has not been well studied. MoTps1 (Magnaporthe oryzae Tps1) plays a critical role in carbon and nitrogen metabolism, but its structural information is unknown. Here we present the crystal structures of MoTps1 apo, binary (with UDPG) and ternary (with UDPG/G6P or UDP/T6P) complexes. MoTps1 consists of two modified Rossmann-fold domains and a catalytic center in-between. Unlike Escherichia coli OtsA (EcOtsA, the Tps1 of E. coli), MoTps1 exists as a mixture of monomer, dimer, and oligomer in solution. Inter-chain salt bridges, which are not fully conserved in EcOtsA, play primary roles in MoTps1 oligomerization. Binding of UDPG by the MoTps1 C-terminal domain modifies the substrate pocket of MoTps1. In the MoTps1 ternary complex structure, UDP and T6P, the products of UDPG and G6P, are detected, and substantial conformational rearrangements of the N-terminal domain, including structural reshuffling (β3–β4 loop to α0 helix) and movement of a ‘shift region’ towards the catalytic centre, are observed. These conformational changes shift MoTps1 into a ‘closed’ state compared with its ‘open’ state in the apo or UDPG complex structures. By solving the EcOtsA apo structure, we confirmed that similar ligand-binding-induced conformational changes also exist in EcOtsA, although no structural reshuffling is involved. Based on our research and previous studies, we present a model for the catalytic process of Tps1. Our research provides novel information on MoTps1, the Tps1 family, and structure-based antifungal drug design.


Author(s):  
B. Mathura Bai ◽  
N. Mangathayaru ◽  
B. Padmaja Rani ◽  
Shadi Aljawarneh

Missing attribute values are one of the most common problems faced when mining medical datasets. Estimating missing values is a major challenge in the pre-processing of datasets. Any wrong estimate of missing attribute values can lead to inefficient and improper classification, resulting in lower classifier accuracies. Similarity measures play a key role during the imputation process. The use of an appropriate similarity measure can help to achieve better imputation and improved classification accuracies. This paper proposes a novel imputation measure for finding the similarity between missing and non-missing instances in medical datasets. Experiments are carried out by applying both the proposed imputation technique and popular existing benchmark imputation techniques. Classification is carried out using the KNN, J48, SMO and RBFN classifiers. Experimental analysis showed that, after imputation of medical records using the proposed technique, the classification accuracies reported by the KNN, J48 and SMO classifiers improved compared with those obtained using the existing benchmark imputation techniques.

