Classification and prediction of protein–protein interaction interface using machine learning algorithm

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Subhrangshu Das ◽  
Saikat Chakrabarti

Abstract: Structural insight into the protein–protein interaction (PPI) interface can provide knowledge about the kinetics, thermodynamics and molecular functions of the complex while elucidating its role in diseases and further enabling it as a potential therapeutic target. However, owing to experimental lag in solving protein–protein complex structures, three-dimensional (3D) knowledge of the PPI interfaces can be gained via computational approaches like molecular docking and post-docking analyses. Despite the development of numerous docking tools and techniques, success in identifying native-like interfaces based on docking score functions is limited. Hence, we carried out an in-depth investigation of the structural features of the interface that might successfully delineate native complexes from non-native ones. We identify interface properties that show statistically significant differences between native and non-native interfaces belonging to homo and hetero protein–protein complexes. Utilizing these properties, a support vector machine (SVM) based classification scheme has been implemented to differentiate native-like and non-native-like complexes generated using docking decoys. Benchmarking and comparative analyses suggest very good performance of our SVM classifiers. Further, protein interactions that are proven via experimental findings but not resolved structurally were subjected to this approach, where 3D models of the complexes were generated and the most likely interfaces were predicted. A web server called Protein Complex Prediction by Interface Properties (PCPIP) has been developed to predict whether the interface of a given protein–protein dimer complex resembles known protein interfaces. The server is freely available at http://www.hpppi.iicb.res.in/pcpip/.

2020 ◽  
Vol 27 (37) ◽  
pp. 6306-6355 ◽  
Author(s):  
Marian Vincenzi ◽  
Flavia Anna Mercurio ◽  
Marilisa Leone

Background: Many pathways in healthy cells and/or linked to disease onset and progression depend on large assemblies including multi-protein complexes. Protein-protein interactions may occur through a vast array of modules known as protein interaction domains (PIDs). Objective: This review concerns PIDs that recognize post-translationally modified peptide sequences and intends to provide the scientific community with state-of-the-art knowledge on their 3D structures, binding topologies and potential applications in the drug discovery field. Method: Several databases, such as Pfam (Protein family), SMART (Simple Modular Architecture Research Tool) and the PDB (Protein Data Bank), were searched to look for different domain families and gain structural information on protein complexes in which particular PIDs are involved. Recent literature on PIDs and related drug discovery campaigns was retrieved through PubMed and analyzed. Results and Conclusion: PIDs are rather versatile in their binding preferences. Many of them specifically recognize only particular amino acid stretches with post-translational modifications; a few others are able to interact with several post-translationally modified sequences or with unmodified ones. Many PIDs can be linked to different diseases, including cancer. The tremendous amount of available structural data has led to the structure-based design of several molecules targeting protein-protein interactions mediated by PIDs, including peptides, peptidomimetics and small compounds. More studies are needed to fully establish which PIDs, among the different families, can be considered reliable therapeutic targets; however, attacking PIDs rather than the catalytic domains of a particular protein may represent a route to obtaining selective inhibitors.


2017 ◽  
Vol 114 (40) ◽  
pp. E8333-E8342 ◽  
Author(s):  
Maximilian G. Plach ◽  
Florian Semmelmann ◽  
Florian Busch ◽  
Markus Busch ◽  
Leonhard Heizinger ◽  
...  

Cells contain a multitude of protein complexes whose subunits interact with high specificity. However, the number of different protein folds and interface geometries found in nature is limited. This raises the question of how protein–protein interaction specificity is achieved on the structural level and how the formation of nonphysiological complexes is avoided. Here, we describe structural elements called interface add-ons that fulfill this function and elucidate their role for the diversification of protein–protein interactions during evolution. We identified interface add-ons in 10% of a representative set of bacterial, heteromeric protein complexes. The importance of interface add-ons for protein–protein interaction specificity is demonstrated by an exemplary experimental characterization of over 30 cognate and hybrid glutamine amidotransferase complexes in combination with comprehensive genetic profiling and protein design. Moreover, growth experiments showed that the lack of interface add-ons can lead to physiologically harmful cross-talk between essential biosynthetic pathways. In sum, our complementary in silico, in vitro, and in vivo analysis argues that interface add-ons are a practical and widespread evolutionary strategy to prevent the formation of nonphysiological complexes by specializing protein–protein interactions.


2019 ◽  
Author(s):  
Franziska Seeger ◽  
Anna Little ◽  
Yang Chen ◽  
Tina Woolf ◽  
Haiyan Cheng ◽  
...  

Abstract: Protein-protein interactions regulate many essential biological processes and play an important role in health and disease. The process of experimentally characterizing protein residues that contribute the most to protein-protein interaction affinity and specificity is laborious. Thus, developing models that accurately characterize hotspots at protein-protein interfaces provides important information about how to inhibit therapeutically relevant protein-protein interactions. During the course of the ICERM WiSDM workshop 2017, we combined the KFC2a protein-protein interaction hotspot prediction features with Rosetta scoring function terms and interface filter metrics. A 2-way and 3-way forward selection strategy was employed to train support vector machine classifiers, as was a reverse feature elimination strategy. From these results, we identified subsets of KFC2a and Rosetta combined features that show improved performance over KFC2a features alone.
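As a rough illustration of the forward selection idea mentioned in this abstract, the sketch below implements plain greedy forward selection, adding one feature at a time. It is not the authors' 2-way or 3-way procedure, which the abstract does not specify. The feature names (`feat_a`, `feat_b`, `feat_c`), the `toy_score` function and its numbers are all hypothetical; in practice the score would be something like the cross-validated performance of a classifier trained on the candidate feature subset.

```python
def forward_selection(features, score_fn, max_features=None):
    """Greedy forward selection: repeatedly add the feature that most improves the score."""
    selected = []
    best_score = float("-inf")
    remaining = list(features)
    while remaining and (max_features is None or len(selected) < max_features):
        # Score every subset obtained by adding one remaining feature.
        candidates = [(score_fn(selected + [f]), f) for f in remaining]
        score, feature = max(candidates, key=lambda pair: pair[0])
        if score <= best_score:
            break  # no candidate improves on the current subset
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


# Hypothetical stand-in for a cross-validated classifier score:
# each feature contributes a fixed gain, and every feature costs a small penalty.
TOY_GAINS = {"feat_a": 0.30, "feat_b": 0.20, "feat_c": 0.01}


def toy_score(subset):
    return sum(TOY_GAINS[f] for f in subset) - 0.05 * len(subset)


selected, best = forward_selection(TOY_GAINS, toy_score)
print(selected, round(best, 2))
```

With these toy numbers the third feature adds less than it costs, so the search stops after two features. Reverse feature elimination works the other way round: start from the full set and drop the feature whose removal hurts the score least.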


Author(s):  
Morihiro Hayashida ◽  
Tatsuya Akutsu

Protein-protein interactions play various essential roles in cellular systems. Many methods have been developed for inference of protein-protein interactions from protein sequence data. In this paper, the authors focus on methods based on domain-domain interactions, where a domain is defined as a region within a protein that either performs a specific function or constitutes a stable structural unit. In these methods, the probabilities of domain-domain interactions are inferred from known protein-protein interaction data and protein domain data, and then prediction of interactions is performed based on these probabilities and contents of domains of given proteins. This paper overviews several fundamental methods, which include association method, expectation maximization-based method, support vector machine-based method, linear programming-based method, and conditional random field-based method. This paper also reviews a simple evolutionary model of protein domains, which yields a scale-free distribution of protein domains. By combining with a domain-based protein interaction model, a scale-free distribution of protein-protein interaction networks is also derived.
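The simplest of the listed approaches, the association method, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the score of a domain pair is taken as the fraction of protein pairs containing that domain pair that are known to interact, and two proteins are predicted to interact if at least one of their domain pairs does, treating domain pairs as independent. The protein names, domain names and interaction list are hypothetical toy data.

```python
from collections import defaultdict
from itertools import product


def domain_pairs(domains_p, domains_q):
    """All unordered domain pairs formed between two proteins."""
    return {tuple(sorted(pair)) for pair in product(domains_p, domains_q)}


def association_scores(protein_domains, interactions):
    """Score each domain pair as (interacting protein pairs containing it) / (all protein pairs containing it)."""
    interacting = {frozenset(pair) for pair in interactions}
    proteins = sorted(protein_domains)
    n_total = defaultdict(int)
    n_interacting = defaultdict(int)
    for i, p in enumerate(proteins):
        for q in proteins[i + 1:]:
            for dp in domain_pairs(protein_domains[p], protein_domains[q]):
                n_total[dp] += 1
                if frozenset((p, q)) in interacting:
                    n_interacting[dp] += 1
    return {dp: n_interacting[dp] / n_total[dp] for dp in n_total}


def interaction_probability(domains_p, domains_q, scores):
    """P(interaction) = 1 - product over domain pairs of (1 - score), assuming independence."""
    prob_none = 1.0
    for dp in domain_pairs(domains_p, domains_q):
        prob_none *= 1.0 - scores.get(dp, 0.0)
    return 1.0 - prob_none


# Hypothetical toy data: four proteins, three domains, two known interactions.
protein_domains = {"P1": {"A"}, "P2": {"B"}, "P3": {"A", "C"}, "P4": {"B"}}
interactions = [("P1", "P2"), ("P3", "P4")]

scores = association_scores(protein_domains, interactions)
prob = interaction_probability({"A", "C"}, {"B"}, scores)
print(scores[("A", "B")], prob)
```

In this toy set the domain pair (A, B) occurs in four protein pairs, two of which interact, so its score is 0.5. The other methods named in the abstract replace this simple ratio with estimates obtained by expectation maximization, linear programming and so on.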


2019 ◽  
Vol 16 (4) ◽  
pp. 263-274
Author(s):  
Chunhua Zhang ◽  
Sijia Guo ◽  
Jingbo Zhang ◽  
Xizi Jin ◽  
Yanwen Li ◽  
...  

Protein-protein interactions play an important role in biological and cellular processes. Biochemical experiments are the most reliable approach to identifying protein-protein interactions, but they are time-consuming and expensive. This is one of the main reasons why only a small fraction of the complete protein-protein interaction networks is available so far. Hence, accurate computational methods to predict protein-protein interactions are greatly needed. In this work, we propose a new weighted feature fusion algorithm for protein-protein interaction prediction, which extracts both protein sequence features and evolutionary features in order to use both global and local information to identify protein-protein interactions. The method employs the maximum margin criterion for feature selection and a support vector machine for classification. Experimental results on 11188 protein pairs showed that our method had better performance and robustness. On the independent Helicobacter pylori database, the method achieved 99.59% sensitivity and 93.66% prediction accuracy, while the maximum margin criterion is 88.03%. The results indicate that our method is more efficient in predicting protein-protein interactions than six other state-of-the-art peer methods.
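For readers unfamiliar with the two metrics quoted in this abstract, the sketch below shows how sensitivity and accuracy are conventionally computed from confusion-matrix counts. The counts used here are hypothetical and are not taken from the paper.

```python
def sensitivity(tp, fn):
    """Fraction of truly interacting pairs that are predicted as interacting."""
    return tp / (tp + fn)


def accuracy(tp, tn, fp, fn):
    """Fraction of all pairs, interacting or not, that are classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts for illustration only.
tp, fn, tn, fp = 90, 10, 80, 20
print(sensitivity(tp, fn), accuracy(tp, tn, fp, fn))
```

Sensitivity can exceed accuracy, as in the figures the abstract reports, when a classifier recovers nearly all true interactions but also mislabels some non-interacting pairs.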


2015 ◽  
Vol 2015 ◽  
pp. 1-9
Author(s):  
Peng Liu ◽  
Lei Yang ◽  
Daming Shi ◽  
Xianglong Tang

A method for predicting protein-protein interactions based on detected protein complexes is proposed to repair deficient interactions derived from high-throughput biological experiments. Protein complexes are pruned and decomposed into small parts based on the adaptive k-cores method to predict protein-protein interactions associated with the complexes. The proposed method is adaptive to protein complexes with different structure, number, and size of nodes in a protein-protein interaction network. Based on different complex sets detected by various algorithms, we can obtain different prediction sets of protein-protein interactions. The reliability of the predicted interaction sets is proved by using estimations with statistical tests and direct confirmation of the biological data. In comparison with the approaches which predict the interactions based on the cliques, the overlap of the predictions is small. Similarly, the overlaps among the predicted sets of interactions derived from various complex sets are also small. Thus, every predicted set of interactions may complement and improve the quality of the original network data. Meanwhile, the predictions from the proposed method replenish protein-protein interactions associated with protein complexes using only the network topology.
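The pruning step described here builds on the standard notion of a k-core: the largest subgraph in which every node has at least k neighbours inside the subgraph. The sketch below shows ordinary k-core pruning only. The adaptive choice of k and the subsequent interaction prediction are specific to the paper and are not reproduced. The node names and edges form a hypothetical toy complex.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Undirected adjacency sets from an edge list, ignoring self-loops."""
    adjacency = defaultdict(set)
    for u, v in edges:
        if u != v:
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency


def k_core(edges, k):
    """Return the node set of the k-core by repeatedly removing nodes of degree < k."""
    adjacency = build_adjacency(edges)
    removed = set()
    stack = [u for u, neighbours in adjacency.items() if len(neighbours) < k]
    while stack:
        u = stack.pop()
        if u in removed:
            continue
        removed.add(u)
        for v in list(adjacency[u]):
            adjacency[v].discard(u)
            # Removing u may push a neighbour below the degree threshold.
            if v not in removed and len(adjacency[v]) < k:
                stack.append(v)
        adjacency[u].clear()
    return set(adjacency) - removed


# Hypothetical toy complex: a 4-clique (a, b, c, d) with two loosely attached nodes.
edges = [
    ("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
    ("e", "a"),              # e touches the clique once
    ("f", "a"), ("f", "b"),  # f touches the clique twice
]
print(sorted(k_core(edges, 2)), sorted(k_core(edges, 3)))
```

Raising k peels away the loosely attached proteins first, which is why a k-core view is a natural way to separate the dense centre of a complex from its periphery.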


2020 ◽  
Vol 11 (4) ◽  
pp. 7539-7548
Author(s):  
Christina Nilofer ◽  
Arumugam Mohanapriya

Two or more proteins interact in vivo to perform complex molecular functions including catalysis, regulation, assembly, immunity and inhibition through the formation of stable interfaces. This interaction is governed by several factors that are selective, sensitive and specific in nature. Several interface features have been documented since 1975. The study of these interface features of proteins and their dynamicity during interaction with different proteins helps in understanding the mechanisms underlying diverse molecular functions and their biological processes. Computational tools greatly assist in studying the interface features that determine the interaction between two or more proteins. In this context, this review enumerates the interface features reported thus far, along with the tools that aid in deciphering protein features (physicochemical characteristics, binding site and interface residue prediction, and hotspot residues) and the approaches employed in predicting these features. The review also discusses the advantages and limitations of experimental techniques and computational biology tools deployed for deciphering protein-protein interactions. Altogether, the review provides insights into the optimal tools and strategies involved in protein interaction studies, which should help researchers understand the protein structural features and molecular principles of protein-protein interactions with known functions.


1998 ◽  
Vol 76 (2-3) ◽  
pp. 351-358 ◽  
Author(s):  
Katherine LB Borden

The cysteine-rich zinc-binding motifs known as the RING and B-box are found in several unrelated proteins. Structural, biochemical, and biological studies of these motifs reveal that they mediate protein-protein interactions. Several RING-containing proteins are oncoproteins and recent data indicate that proapoptotic activities can be mediated through the RING. 1H NMR methods were used to determine the structures of RINGs and a B-box domain and to monitor the conformational changes these motifs undergo upon zinc ligation. This review discusses in detail the structural features of the RING and B-box domains. Further, possible structure-function relationships for these motifs, particularly in their role as protein interaction domains, are discussed. Key words: RING, B-box, PML, NMR.


2014 ◽  
Author(s):  
Thomas A. Hopf ◽  
Charlotta P.I. Schärfe ◽  
João P.G.L.M. Rodrigues ◽  
Anna G. Green ◽  
Chris Sander ◽  
...  

Protein-protein interactions are fundamental to many biological processes. Experimental screens have identified tens of thousands of interactions and structural biology has provided detailed functional insight for select 3D protein complexes. An alternative rich source of information about protein interactions is the evolutionary sequence record. Building on earlier work, we show that analysis of correlated evolutionary sequence changes across proteins identifies residues that are close in space with sufficient accuracy to determine the three-dimensional structure of the protein complexes. We evaluate prediction performance in blinded tests on 76 complexes of known 3D structure, predict protein-protein contacts in 32 complexes of unknown structure, and demonstrate how evolutionary couplings can be used to distinguish between interacting and non-interacting protein pairs in a large complex. With the current growth of sequence databases, we expect that the method can be generalized to genome-wide elucidation of protein-protein interaction networks and used for interaction predictions at residue resolution.


2016 ◽  
Author(s):  
Claudio Mirabello ◽  
Björn Wallner

Abstract: Protein-protein interactions (PPIs) are crucial for protein function. There exist many techniques to identify PPIs experimentally, but determining the interactions in molecular detail is still difficult and very time-consuming. The fact that the number of PPIs is vastly larger than the number of individual proteins makes it practically impossible to characterize all interactions experimentally. Computational approaches that can bridge this gap and predict PPIs and model the interactions in molecular detail are greatly needed. Here we present InterPred, a fully automated pipeline that predicts and models PPIs from sequence using structural modelling combined with massive structural comparisons and molecular docking. A key component of the method is the use of a novel random forest classifier that integrates several structural features to distinguish correct from incorrect protein-protein interaction models. We show that InterPred represents a major improvement in protein-protein interaction detection, with a performance comparable to or better than experimental high-throughput techniques. We also show that our full-atom protein-protein complex modelling pipeline performs better than state-of-the-art protein docking methods on a standard benchmark set. In addition, InterPred was also one of the top predictors in the latest CAPRI37 experiment. InterPred source code can be downloaded from http://wallnerlab.org/InterPred

