Protein Structure Prediction Using Homology Modeling

2017 ◽  
pp. 877-897 ◽  
Author(s):  
Akanksha Gupta ◽  
Pallavi Mohanty ◽  
Sonika Bhatnagar

The sequence-structure deficit is one of the critical problems today: high-throughput sequencing has produced large datasets of protein sequences, but their corresponding 3D structures still need to be determined. Homology modeling, also termed comparative modeling, refers to modeling the 3D structure of a protein by exploiting structural information from other known protein structures with good sequence similarity. Homology models contain sufficient information about the spatial arrangement of important residues in the protein and are often used in drug design for screening large libraries by molecular docking techniques. This chapter provides a brief description of protein tertiary structure prediction and homology modeling. The authors describe the steps involved in homology modeling protocols and provide information on the various resources available for them.
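The reliance on templates "with good sequence similarity" can be illustrated with a small sketch. It assumes the query has already been aligned to each candidate template by some external tool; the function names `percent_identity` and `pick_template`, the gap character `-`, and the toy sequences are all hypothetical and are not taken from the chapter.

```python
def percent_identity(aligned_query: str, aligned_template: str) -> float:
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for q, t in zip(aligned_query, aligned_template):
        if q == "-" or t == "-":
            continue  # skip gap columns (an assumption; other conventions exist)
        compared += 1
        if q == t:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def pick_template(alignments: dict) -> str:
    """Return the template name whose pairwise alignment has the highest identity.

    `alignments` maps a hypothetical template name to (aligned_query, aligned_template).
    """
    return max(alignments, key=lambda name: percent_identity(*alignments[name]))


# Toy, made-up alignments for illustration only.
candidates = {
    "template_1": ("ACDE-", "ACDFG"),
    "template_2": ("ACDEK", "ACDEK"),
}
print(pick_template(candidates))
```

Real protocols typically weigh more than raw identity (coverage, structure resolution, alignment quality), so this is only the simplest possible ranking criterion.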


Author(s):  
Raghunath Satpathy

Proteins play a vital molecular role in all living organisms. Determining protein structure experimentally is difficult; theoretical prediction methods offer an alternative. Predicting the 3D structure of proteins is very important in biology because it leads to the discovery of useful drugs and enzymes, and it is currently considered an important research domain. Protein structure prediction here means identifying the tertiary structure. From the computational point of view, different models (protein representations) have been developed, along with efficient optimization methods, to predict protein structure. Bio-inspired computation is used mostly for the optimization process in solving protein structure. These algorithms have recently received great interest and attention in the literature. This chapter discusses the key features of five recently developed types of bio-inspired computational algorithms applied to protein structure prediction problems.


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Marcin Magnus ◽  
Kalli Kappel ◽  
Rhiju Das ◽  
Janusz M. Bujnicki

Abstract. Background: The understanding of the importance of RNA has dramatically changed over recent years. As in the case of proteins, the function of an RNA molecule is encoded in its tertiary structure, which in turn is determined by the molecule’s sequence. The prediction of tertiary structures of complex RNAs is still a challenging task. Results: Using the observation that RNA sequences from the same RNA family fold into conserved structure, we test herein whether parallel modeling of RNA homologs can improve ab initio RNA structure prediction. EvoClustRNA is a multi-step modeling process, in which homologous sequences for the target sequence are selected using the Rfam database. Subsequently, independent folding simulations using Rosetta FARFAR and SimRNA are carried out. The model of the target sequence is selected based on the most common structural arrangement of the common helical fragments. As a test, on two blind RNA-Puzzles challenges, EvoClustRNA predictions ranked as the first of all submissions for the L-glutamine riboswitch and as the second for the ZMP riboswitch. Moreover, through a benchmark of known structures, we discovered several cases in which particular homologs were unusually amenable to structure recovery in folding simulations compared to the single original target sequence. Conclusion: This work, for the first time to our knowledge, demonstrates the importance of the selection of the target sequence from an alignment of an RNA family for the success of RNA 3D structure prediction. These observations prompt investigations into a new direction of research for checking 3D structure “foldability” or “predictability” of related RNA sequences to obtain accurate predictions. To support new research in this area, we provide all relevant scripts in a documented and ready-to-use form.
By exploring new ideas and identifying limitations of the current RNA 3D structure prediction methods, this work is bringing us closer to the near-native computational RNA 3D models.
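The idea of selecting the model that shows "the most common structural arrangement" can be sketched as a simple consensus pick. This is not the EvoClustRNA code: it assumes a precomputed symmetric distance matrix (for example, RMSD over the shared helical fragments), and the function name `most_common_model`, the cutoff value, and the toy numbers are hypothetical.

```python
def most_common_model(names, dist, cutoff):
    """Return (name, neighbour_count) for the model with the most neighbours.

    names  : list of model identifiers
    dist   : square matrix (list of lists), dist[i][j] = distance between models i and j
    cutoff : models closer than or equal to this distance count as neighbours
    """
    best_name, best_count = None, -1
    for i, name in enumerate(names):
        count = sum(
            1 for j in range(len(names)) if j != i and dist[i][j] <= cutoff
        )
        if count > best_count:  # ties keep the earlier model
            best_name, best_count = name, count
    return best_name, best_count


# Toy distances (made up): three similar models and one outlier.
names = ["m0", "m1", "m2", "m3"]
dist = [
    [0.0, 2.0, 6.0, 12.0],
    [2.0, 0.0, 3.0, 11.0],
    [6.0, 3.0, 0.0, 13.0],
    [12.0, 11.0, 13.0, 0.0],
]
print(most_common_model(names, dist, cutoff=5.0))
```

The model with the largest neighbourhood acts as the representative of the densest cluster; a full clustering procedure would usually remove that cluster and repeat to obtain ranked alternatives.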


2019 ◽  
Author(s):  
Marcin Magnus ◽  
Kalli Kappel ◽  
Rhiju Das ◽  
Janusz Bujnicki

Abstract. Background: The understanding of the importance of RNA has dramatically changed over recent years. As in the case of proteins, the function of an RNA molecule is encoded in its tertiary structure, which in turn is determined by the molecule's sequence. The prediction of tertiary structures of complex RNAs is still a challenging task. Results: Using the observation that RNA sequences from the same RNA family fold into conserved structure, we test herein whether parallel modeling of RNA homologs can improve ab initio RNA structure prediction. EvoClustRNA is a multi-step modeling process, in which homologous sequences for the target sequence are selected using the Rfam database. Subsequently, independent folding simulations using Rosetta FARFAR and SimRNA are carried out. The model of the target sequence is selected based on the most common structural arrangement of the common helical fragments. As a test, on two blind RNA-Puzzles challenges, EvoClustRNA predictions ranked as the first of all submissions for the L-glutamine riboswitch and as the second for the ZMP riboswitch. Moreover, through a benchmark of known structures, we discovered several cases in which particular homologs were unusually amenable to structure recovery in folding simulations compared to the single original target sequence. Conclusion: This work, for the first time to our knowledge, demonstrates how important the selection of the target sequence from an alignment of an RNA family is for the success of RNA 3D structure prediction. These observations prompt investigations into a new direction of research for checking 3D structure “foldability” or “predictability” of related RNA sequences to obtain accurate predictions. To support new research in this area, we provide all relevant scripts in a documented and ready-to-use form.
By exploring new ideas and identifying limitations of the current RNA 3D structure prediction methods, this work is bringing us closer to the near-native computational RNA 3D models.


2019 ◽  
Author(s):  
Marcin Magnus ◽  
Kalli Kappel ◽  
Rhiju Das ◽  
Janusz Bujnicki

Abstract. Background: The understanding of the importance of RNA has dramatically changed over recent years. As in the case of proteins, the function of an RNA molecule is encoded in its tertiary structure, which in turn is determined by the molecule's sequence. The prediction of tertiary structures of complex RNAs is still a challenging task. Results: Using the observation that RNA sequences from the same RNA family fold into conserved structure, we test herein whether parallel modeling of RNA homologs can improve ab initio RNA structure prediction. EvoClustRNA is a multi-step modeling process, in which homologous sequences for the target sequence are selected using the Rfam database. Subsequently, independent folding simulations using Rosetta FARFAR and SimRNA are carried out. The model of the target sequence is selected based on the most common structural arrangement of the common helical fragments. As a test, on two blind RNA-Puzzles challenges, EvoClustRNA predictions ranked as the first of all submissions for the L-glutamine riboswitch and as the second for the ZMP riboswitch. Conclusion: Through a benchmark of known structures, we discovered several cases in which particular homologs were unusually amenable to structure recovery in folding simulations compared to the single original target sequence.


2021 ◽  
Author(s):  
Michael Heinzinger ◽  
Maria Littmann ◽  
Ian Sillitoe ◽  
Nicola Bordin ◽  
Christine Orengo ◽  
...  

Thanks to the recent advances in protein three-dimensional (3D) structure prediction, in particular through AlphaFold 2 and RoseTTAFold, the abundance of protein 3D information will explode over the next year(s). Expert resources based on 3D structures such as SCOP and CATH have been organizing the complex sequence-structure-function relations into a hierarchical classification schema. Experimental structures are leveraged through multiple sequence alignments, or more generally through homology-based inference (HBI) transferring annotations from a protein with experimentally known annotation to a query without annotation. Here, we presented a novel approach that expands the concept of HBI from a low-dimensional sequence-distance lookup to the level of a high-dimensional embedding-based annotation transfer (EAT). Secondly, we introduced a novel solution using single protein sequence representations from protein Language Models (pLMs), so-called embeddings (Prose, ESM-1b, ProtBERT, and ProtT5), as input to contrastive learning, by which a new set of embeddings was created that optimized constraints captured by hierarchical classifications of protein 3D structures. These new embeddings (dubbed ProtTucker) clearly improved what was historically referred to as threading or fold recognition. Thereby, the new embeddings enabled the intrusion into the midnight zone of protein comparisons, i.e., the region in which the level of pairwise sequence similarity is akin to random relations and therefore is hard to navigate by HBI methods. Cautious benchmarking showed that ProtTucker reached much further than advanced sequence comparisons without the need to compute alignments, allowing it to be orders of magnitude faster. Code is available at https://github.com/Rostlab/EAT.
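Embedding-based annotation transfer can be pictured as a nearest-neighbour lookup: the query inherits the annotation of the closest labelled protein in embedding space. The sketch below is not the code from the linked repository. It assumes Euclidean distance, uses made-up 3-dimensional vectors in place of real pLM embeddings, and the names `transfer_annotation`, `fold_A`, and `fold_B` are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def transfer_annotation(query, lookup):
    """Return (label, distance) of the labelled embedding closest to `query`.

    lookup : list of (embedding, label) pairs with known annotations
    """
    best_embedding, best_label = min(
        lookup, key=lambda item: euclidean(query, item[0])
    )
    return best_label, euclidean(query, best_embedding)


# Toy lookup set; real embeddings would come from a protein language model.
lookup = [
    ([0.0, 0.0, 1.0], "fold_A"),
    ([1.0, 0.0, 0.0], "fold_B"),
]
label, distance = transfer_annotation([0.9, 0.1, 0.0], lookup)
print(label, round(distance, 3))
```

Returning the distance alongside the label lets a caller reject transfers from hits that are too far away, which is one plausible way to trade coverage for reliability.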


2019 ◽  
Vol 20 (9) ◽  
pp. 2291 ◽  
Author(s):  
Sultan N. Alharbi ◽  
Ibtehal S. Alduhaymi ◽  
Lama Alqahtani ◽  
Musaad A. Altammaami ◽  
Fahad M. Alhoshani ◽  
...  

Lin-28 is an RNA-binding protein that is known for its role in promoting the pluripotency of stem cells. In the present study, Arabian camel Lin-28 (cLin-28) cDNA was identified and analyzed. Full-length cLin-28 mRNA was obtained using the reverse transcription polymerase chain reaction (RT-PCR). It was shown to be 715 bp in length, and the open reading frame (ORF) encoded 205 amino acids. The molecular weight and theoretical isoelectric point (pI) of the cLin-28 protein were predicted to be 22.389 kDa and 8.50, respectively. Results from the bioinformatics analysis revealed that cLin-28 has two main domains: an N-terminal cold-shock domain (CSD) and a C-terminal pair of retroviral-type Cysteine3Histidine (CCHC) zinc fingers. Sequence similarity and phylogenetic analysis showed that the cLin-28 protein is grouped together with Camelus bactrianus and Bos taurus. Quantitative real-time PCR (qPCR) analysis showed that cLin-28 mRNA is highly expressed in the lung, heart, liver, and esophageal tissues. Peptide mass fingerprint-mass spectrometry (PMF-MS) analysis of the purified cLin-28 protein confirmed the identity of this protein. Comparing the modeled 3D structure of cLin-28 protein with the available protein 3D structure of the human Lin-28 protein confirmed the presence of CSD and retroviral-type CCHC zinc fingers, and high similarities were noted between the two structures by using super secondary structure prediction.


2012 ◽  
Vol 09 ◽  
pp. 143-156 ◽  
Author(s):  
ZAKARIA N. MAHMOOD ◽  
MASSUDI MAHMUDDIN ◽  
MOHAMMED NOORALDEEN MAHMOOD

Encoding the amino acid sequences of proteins so that they can be classified into their respective families and subfamilies is an important research area. However, for a given protein, knowing the exact action, whether hormonal, enzymatic, transmembrane, or nuclear receptor, depends not solely on the amino acid sequence but also on the way the amino acid chain folds. This study provides a prototype system that is able to predict a protein's tertiary structure. Several methods are used to develop and evaluate the system to produce better accuracy in protein 3D structure prediction. The Bees Optimization algorithm, which is inspired by the food-foraging behavior of honey bees, is used in the searching phase. In this study, the experiment is conducted on short-sequence proteins that have been used in previous research with well-known tools. The proposed approach shows promising results.
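The foraging-inspired search can be illustrated with a heavily simplified Bees Algorithm loop: scouts sample random sites, the best sites receive recruits for local search, and the remaining scouts keep exploring. This is a sketch under stated assumptions, not the authors' system: the objective `toy_energy` is a stand-in for a protein energy function, and every parameter name and default value here is hypothetical.

```python
import random


def toy_energy(x):
    """Stand-in objective (sum of squares); a real system would score a conformation."""
    return sum(v * v for v in x)


def bees_search(energy, dim, n_scouts=20, n_best=5, n_recruits=5,
                radius=0.5, bounds=(-5.0, 5.0), iterations=100, seed=0):
    """Minimise `energy` with a simplified Bees Algorithm; returns (best_point, best_energy)."""
    rng = random.Random(seed)
    lo, hi = bounds

    def random_point():
        # Global exploration by a scout bee.
        return [rng.uniform(lo, hi) for _ in range(dim)]

    def neighbour(site):
        # Local search by a recruited bee, clipped to the bounds.
        return [min(hi, max(lo, v + rng.uniform(-radius, radius))) for v in site]

    sites = sorted((random_point() for _ in range(n_scouts)), key=energy)
    for _ in range(iterations):
        next_sites = []
        for site in sites[:n_best]:
            # Keep the site itself so a good solution is never lost.
            candidates = [site] + [neighbour(site) for _ in range(n_recruits)]
            next_sites.append(min(candidates, key=energy))
        # Remaining bees scout new random sites.
        next_sites += [random_point() for _ in range(n_scouts - n_best)]
        sites = sorted(next_sites, key=energy)
    best = sites[0]
    return best, energy(best)


best_point, best_energy = bees_search(toy_energy, dim=2)
print(best_point, best_energy)
```

Published variants add refinements such as elite sites with more recruits and shrinking neighbourhoods; they are omitted here to keep the core explore-and-exploit loop visible.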

