Building and analysis of protein-protein interactions related to diabetes mellitus using support vector machine, biomedical text mining and network analysis

2016 ◽  
Vol 65 ◽  
pp. 37-44 ◽  
Author(s):  
Renu Vyas ◽  
Sanket Bapat ◽  
Esha Jain ◽  
Muthukumarasamy Karthikeyan ◽  
Sanjeev Tambe ◽  
...  
2012 ◽  
Vol 2012 ◽  
pp. 1-23
Author(s):  
J. M. Urquiza ◽  
I. Rojas ◽  
H. Pomares ◽  
J. Herrera ◽  
J. P. Florido ◽  
...  

Protein-protein interactions (PPIs) play a crucial role in cellular processes. In the present work, a new approach is proposed to construct a PPI predictor by training a support vector machine model, using a mutual information filter-wrapper parallel feature selection algorithm and an iterative, hierarchical clustering to select a relevant negative training set. Using a selected suboptimal set of features, the constructed support vector machine model is able to classify PPIs with high accuracy on any positive and negative datasets.
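The filter half of a mutual-information feature selection can be illustrated with a small sketch. This is not the authors' algorithm: it shows only how discrete features might be ranked by their mutual information with the interaction label, and it omits the wrapper stage, the parallelisation, the clustering of negatives and the SVM itself. The feature names and the toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(feature, labels):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    mi = 0.0
    for (f, y), count in joint.items():
        # p(f, y) * log2( p(f, y) / (p(f) * p(y)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (feature_counts[f] * label_counts[y]))
    return mi


def rank_features(columns, labels):
    """Return feature names sorted by decreasing mutual information, plus the scores."""
    scores = {name: mutual_information(col, labels) for name, col in columns.items()}
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores


# Hypothetical toy data: 1 = interacting pair, 0 = non-interacting pair.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "shared_go_term": [1, 1, 1, 1, 0, 0, 0, 0],    # tracks the label exactly
    "same_compartment": [1, 1, 1, 0, 0, 0, 0, 1],  # agrees on 6 of 8 pairs
    "random_flag": [1, 0, 1, 0, 1, 0, 1, 0],       # independent of the label
}

ranking, scores = rank_features(columns, labels)
for name in ranking:
    print(f"{name}: {scores[name]:.3f} bits")
```

In a filter-wrapper scheme, a ranking like this would typically be used to shortlist features before a classifier-based wrapper evaluates candidate subsets.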


PROTEOMICS ◽  
2005 ◽  
Vol 5 (4) ◽  
pp. 876-884 ◽  
Author(s):  
Siaw Ling Lo ◽  
Cong Zhong Cai ◽  
Yu Zong Chen ◽  
Maxey C. M. Chung

2014 ◽  
Vol 11 (90) ◽  
pp. 20130860 ◽  
Author(s):  
Véronique Hamon ◽  
Raphael Bourgeas ◽  
Pierre Ducrot ◽  
Isabelle Theret ◽  
Laura Xuereb ◽  
...  

Over the last 10 years, protein–protein interactions (PPIs) have shown increasing potential as new therapeutic targets. As a consequence, PPIs are today the most screened target class in high-throughput screening (HTS). The development of broad chemical libraries dedicated to these particular targets is essential; however, the chemical space associated with this ‘high-hanging fruit’ is still under debate. Here, we analyse the properties of 40 non-redundant small molecules present in the 2P2I database (http://2p2idb.cnrs-mrs.fr/) to define a general profile of orthosteric inhibitors and propose an original protocol to filter general screening libraries using a support vector machine (SVM) with 11 standard Dragon molecular descriptors. The filtering protocol has been validated using external datasets from PubChem BioAssay and results from in-house screening campaigns. This external blind validation demonstrated the ability of the SVM model to reduce the size of the filtered chemical library by eliminating up to 96% of the compounds as well as enhancing the proportion of active compounds by up to a factor of 8. We believe that the resulting chemical space identified in this paper will provide the scientific community with a concrete support to search for PPI inhibitors during HTS campaigns.
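The two figures quoted in the validation (fraction of the library eliminated, and enrichment in active compounds) can be made concrete with a short arithmetic sketch. The function name and the compound counts below are hypothetical and are not taken from the paper; the counts were simply chosen so that the results land on the quoted upper bounds of 96% and a factor of 8.

```python
def filtering_stats(n_total, n_active, n_kept, n_active_kept):
    """Summarise a library filter: fraction removed and enrichment in actives."""
    hit_rate_before = n_active / n_total      # proportion of actives in the full library
    hit_rate_after = n_active_kept / n_kept   # proportion of actives after filtering
    return {
        "fraction_removed": 1 - n_kept / n_total,
        "enrichment_factor": hit_rate_after / hit_rate_before,
    }


# Hypothetical library: 10,000 compounds with 50 actives;
# the filter keeps 400 compounds, 16 of which are active.
stats = filtering_stats(n_total=10_000, n_active=50, n_kept=400, n_active_kept=16)
print(f"removed: {stats['fraction_removed']:.0%}")
print(f"enrichment: {stats['enrichment_factor']:.1f}x")
```

Note that in this toy case the filter also discards 34 of the 50 actives, which is why the enrichment factor is usually read alongside the number of actives retained.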


Author(s):  
Varsha D Badal ◽  
Petras J Kundrotas ◽  
Ilya A Vakser

Motivation: Procedures for structural modeling of protein-protein complexes (protein docking) produce a number of models that need to be further analyzed and scored. Scoring can be based on independently determined constraints on the structure of the complex, such as knowledge of amino acids essential for the protein interaction. Previously, we showed that text mining of residues in freely available PubMed abstracts of papers on studies of protein-protein interactions may generate such constraints. However, the absence of post-processing of the spotted residues reduced the usability of the constraints, as a significant number of the residues were not relevant for the binding of the specific proteins. Results: We explored filtering of the irrelevant residues by two machine learning approaches, Deep Recursive Neural Network (DRNN) and Support Vector Machine (SVM) models, with different training/testing schemes. The results showed that the DRNN model is superior to the SVM model when training is performed on the PMC-OA full-text articles and applied to classification (interface or non-interface) of the residues spotted in the PubMed abstracts. When both training and testing are performed on full-text articles or on abstracts, the performance of these models is similar. Thus, in such cases, there is no need to use the DRNN approach, which is computationally expensive, especially at the training stage. The reason is that SVM success is often determined by the similarity in data/text patterns in the training and the testing sets, whereas the sentence structures in the abstracts are, in general, different from those in the full-text articles. Availability: The code and the datasets generated in this study are available at https://gitlab.ku.edu/vakser-lab-public/text-mining/-/tree/2020-09-04. Supplementary information: Supplementary data are available at Bioinformatics online.


2019 ◽  
Vol 40 (11) ◽  
pp. 1233-1242 ◽  
Author(s):  
Sandra Romero-Molina ◽  
Yasser B. Ruiz-Blanco ◽  
Mirja Harms ◽  
Jan Münch ◽  
Elsa Sanchez-Garcia
