structure database
Recently Published Documents


TOTAL DOCUMENTS

238
(FIVE YEARS 46)

H-INDEX

35
(FIVE YEARS 6)

2022 ◽  
Author(s):  
Qiongqiong Feng ◽  
Minghua Hou ◽  
Jun Liu ◽  
Kailong Zhao ◽  
Guijun Zhang

Although remarkable achievements, such as AlphaFold2, have been made in end-to-end structure prediction, fragment libraries remain essential for de novo protein structure prediction, which can help explore and understand the protein-folding mechanism. In this work, we developed a variable-length fragment library (VFlib). In VFlib, a master structure database was first constructed from the Protein Data Bank through sequence clustering. The Hidden Markov Model (HMM) profile of each protein in the master structure database was generated by HHsuite, and the secondary structure of each protein was calculated by DSSP. For the query sequence, the HMM profile was first constructed. Then, variable-length fragments were retrieved from the master structure database through dynamic variable-length profile-profile comparison. A complete method for chopping the query HMM profile during this process was proposed to obtain fragments with increased diversity. Finally, secondary structure information was used to further screen the retrieved fragments and generate the final fragment library for the specific query sequence. Experimental results on a set of 120 nonredundant proteins showed that the global precision and coverage of the fragment library generated by VFlib were 55.04% and 94.95%, respectively, at an RMSD cutoff of 1.5 Å. Compared with the benchmark method NNMake, the global precision of our fragment library increased by 62.89% with equivalent coverage. Furthermore, the fragments generated by VFlib and NNMake were used to predict structure models through fragment assembly. Controlled experimental results demonstrated that the average TM-score of VFlib was 16.00% higher than that of NNMake.
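The abstract reports precision and coverage at an RMSD cutoff but does not define them. The sketch below uses the definitions commonly applied to fragment libraries, which is an assumption here: precision is the fraction of fragments whose RMSD to the native structure is below the cutoff, and coverage is the fraction of query positions spanned by at least one such fragment. The function name and the `(start, length, rmsd)` tuple layout are hypothetical and are not taken from VFlib.

```python
def precision_and_coverage(fragments, query_length, rmsd_cutoff=1.5):
    """Score a fragment library against the native structure.

    fragments: iterable of (start, length, rmsd) tuples, with a 0-based
    start position on the query and the RMSD in angstroms (assumed layout).
    """
    fragments = list(fragments)
    # A fragment counts as "good" when its RMSD is below the cutoff.
    good = [f for f in fragments if f[2] < rmsd_cutoff]
    precision = len(good) / len(fragments) if fragments else 0.0

    # A query position is covered if any good fragment spans it.
    covered = set()
    for start, length, _ in good:
        covered.update(range(start, start + length))
    coverage = len(covered) / query_length if query_length else 0.0
    return precision, coverage


# Toy library for a 10-residue query; the values are illustrative only.
frags = [(0, 4, 0.8), (2, 5, 2.1), (6, 4, 1.2), (3, 3, 1.9)]
precision, coverage = precision_and_coverage(frags, query_length=10)
print(f"precision={precision:.2f} coverage={coverage:.2f}")
```

Because the fragments have variable length, coverage is counted per residue rather than per fragment window, so long and short fragments contribute on the same footing.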


2021 ◽  
Vol 50 (D1) ◽  
pp. D1-D10
Author(s):  
Daniel J Rigden ◽  
Xosé M Fernández

The 2022 Nucleic Acids Research Database Issue contains 185 papers, including 87 papers reporting on new databases and 85 updates from resources previously published in the Issue. Thirteen additional manuscripts provide updates on databases most recently published elsewhere. Seven new databases focus specifically on COVID-19 and SARS-CoV-2, including SCoV2-MD, the first of the Issue's Breakthrough Articles. Major nucleic acid databases reporting updates include MODOMICS, JASPAR and miRTarBase. The AlphaFold Protein Structure Database, described in the second Breakthrough Article, is the stand-out in the protein section, where the Human Proteoform Atlas and GproteinDb are other notable new arrivals. Updates from DisProt, FuzDB and ELM comprehensively cover disordered proteins. Under the metabolism and signalling section Reactome, ConsensusPathDB, HMDB and CAZy are major returning resources. In microbial and viral genomes taxonomy and systematics are well covered by LPSN, TYGS and GTDB. Genomics resources include Ensembl, Ensembl Genomes and UCSC Genome Browser. Major returning pharmacology resource names include the IUPHAR/BPS guide and the Therapeutic Target Database. New plant databases include PlantGSAD for gene lists and qPTMplants for post-translational modifications. The entire Database Issue is freely available online on the Nucleic Acids Research website (https://academic.oup.com/nar). Our latest update to the NAR online Molecular Biology Database Collection brings the total number of entries to 1645. Following last year's major cleanup, we have updated 317 entries, listing 89 new resources and trimming 80 discontinued URLs. The current release is available at http://www.oxfordjournals.org/nar/database/c/.


2021 ◽  
Author(s):  
Lucas Aleixo Leal Pedroza ◽  
Francisco Agenor de Oliveira Neto ◽  
Antonio Marinho da Silva Neto ◽  
Carlos Henrique Madeiros Castelletti ◽  
Priscila Gubert

Introduction: TDP-43 (TAR DNA-binding protein 43) is a 414-amino-acid protein containing two RNA-binding domains (RRM1 and RRM2), spanning residues 101 to 265, an N-terminal region (residues 1 to 100) and a glycine-rich region (residues 266–414). Under physiological conditions it plays a fundamental role in RNA metabolism and in the formation and maintenance of stress granules. Under pathological conditions that are still poorly understood, however, the protein can form cytotoxic aggregates in neuronal cells, damaging mitochondria and the proteasome and leading to neurodegeneration. These factors make this proteinopathy the histopathological hallmark of diseases such as amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) [1]. Objectives: To analyse the aggregation potential of TDP-43 residues using computational tools. Methods: The three-dimensional structure of TDP-43 was obtained from the AlphaFold Protein Structure Database [2] under the code AF-Q13148-F1-model_v1. Aggregation potential was assessed with Aggrescan3D 2.0, which evaluates the aggregation potential of amino acids from a protein conformational structure [3]. Results: The glycine-rich domain (266–414) had the most residues with high aggregation potential (50 amino acids), with a score above 0.000 (positive values), whereas the remaining domains together had only 12 such amino acids, 7 in the NTD (maximum score of 1.1713 at V94) and 5 in the RRMs (maximum score of 1.0120 at I249), according to the Aggrescan3D score. The highest-scoring residues included phenylalanine 316 (2.1992) and isoleucine 383 (2.0641).
According to the literature, the glycine-rich region is directly involved in the interaction of TDP-43 with other cytoplasmic structures, including the formation of aggregates, especially in stress-granule regions, which probably results from the high flexibility that glycine confers on the domain. Although data are lacking, the RRMs are expected to show lower scores, given their role in RNA metabolism. Conclusion: The glycine-rich region of TDP-43 thus has more residues with the potential to form cytotoxic aggregates than the other domains, making this region a possible pharmacological target for inhibiting the progression of the proteinopathy. In addition, the group is conducting further studies to better understand how the flexibility of the glycine-rich domain affects the aggregation potential of adjacent residues.
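The per-domain tally described in this abstract (residues with a positive Aggrescan3D score, grouped by domain) can be sketched as follows. The domain boundaries and the four positive scores come from the abstract; the two negative scores are placeholders added only so the example shows filtering, and the function name and the residue-number-to-score dictionary are hypothetical, not the Aggrescan3D output format.

```python
# Domain boundaries (inclusive residue numbers) as stated in the abstract.
DOMAINS = {
    "NTD": (1, 100),
    "RRM1-RRM2": (101, 265),
    "Gly-rich": (266, 414),
}


def count_aggregation_prone(scores, domains=DOMAINS, threshold=0.0):
    """Count residues per domain whose score exceeds the threshold.

    scores: dict mapping residue number -> per-residue score
    (an assumed input layout).
    """
    counts = {name: 0 for name in domains}
    for position, score in scores.items():
        if score <= threshold:
            continue
        for name, (first, last) in domains.items():
            if first <= position <= last:
                counts[name] += 1
                break
    return counts


scores = {
    94: 1.1713,   # V94, reported in the abstract
    249: 1.0120,  # I249, reported in the abstract
    316: 2.1992,  # F316, reported in the abstract
    383: 2.0641,  # I383, reported in the abstract
    150: -0.5,    # placeholder value, not from the study
    300: -1.2,    # placeholder value, not from the study
}
print(count_aggregation_prone(scores))
```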


2021 ◽  
Author(s):  
Ada Y. Chen ◽  
Juyong Lee ◽  
Ana Damjanovic ◽  
Bernard R. Brooks

We present four tree-based machine learning models for protein pKa prediction. The four models, Random Forest, Extra Trees, eXtreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM), were trained on three experimental PDB and pKa datasets, two of which included a notable portion of internal residues. We observed similar performance among the four machine learning algorithms. The best model trained on the largest dataset performs 37% better than the widely used empirical pKa prediction tool PROPKA. The overall RMSE for this model is 0.69, with surface and buried RMSE values of 0.56 and 0.78, respectively, when considering six residue types (Asp, Glu, His, Lys, Cys and Tyr), and 0.63 when considering Asp, Glu, His and Lys only. We provide pKa predictions for proteins in the human proteome from the AlphaFold Protein Structure Database and observe that 1% of Asp/Glu/Lys residues have highly shifted pKa values close to the physiological pH.
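The overall, surface and buried RMSE figures quoted above are root-mean-square errors between predicted and experimental pKa values, computed over different subsets of residues. A minimal sketch of that bookkeeping is shown below; the record layout `(group, predicted, experimental)` and the numbers are illustrative assumptions and are not data from the study.

```python
import math
from collections import defaultdict


def rmse(pairs):
    """Root-mean-square error over (predicted, experimental) pKa pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one (predicted, experimental) pair")
    return math.sqrt(sum((p - e) ** 2 for p, e in pairs) / len(pairs))


def rmse_by_group(records):
    """Return the RMSE per group plus an 'overall' entry.

    records: iterable of (group, predicted_pka, experimental_pka).
    """
    groups = defaultdict(list)
    for group, predicted, experimental in records:
        groups[group].append((predicted, experimental))
        groups["overall"].append((predicted, experimental))
    return {name: rmse(pairs) for name, pairs in groups.items()}


# Illustrative values only.
records = [
    ("surface", 4.0, 3.5),
    ("surface", 6.5, 6.0),
    ("buried", 8.0, 7.0),
    ("buried", 5.0, 6.0),
]
for name, value in rmse_by_group(records).items():
    print(f"{name}: {value:.2f}")
```

Because the error is squared before averaging, the overall RMSE is not the mean of the group values; it lies between them and is weighted by group size.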


2021 ◽  
Author(s):  
Maarten L Hekkelman ◽  
Ida de Vries ◽  
Robbie P Joosten ◽  
Anastassis Perrakis

Artificial intelligence (AI) methods for constructing structural models of proteins on the basis of their sequence are having a transformative effect in biomolecular sciences. The AlphaFold protein structure database makes available hundreds of thousands of protein structures. However, all these structures lack cofactors essential for their structural integrity and molecular function (e.g. hemoglobin lacks a bound heme), key ions essential for structural integrity (e.g. zinc-finger motifs) or catalysis (e.g. Ca2+ or Zn2+ in metalloproteases), and ligands that are important for biological function (e.g. kinase structures lack ADP or ATP). Here, we present AlphaFill, an algorithm based on sequence and structure similarity, to "transplant" such "missing" small molecules and ions from experimentally determined structures to predicted protein models. These publicly available structural annotations are mapped to predicted protein models, to help scientists interpret biological function and design experiments.


2021 ◽  
Vol 2 (1) ◽  
pp. 39-47
Author(s):  
M Farid Khoirul Alim ◽  
Hartatiek Hartatiek ◽  
Chusnana Insjaf Yogihati

Recent developments in science and technology have driven many innovations in medicine, particularly the use of biomaterials as bone and dental implants; one such material is the CaO-TiO2 composite bioceramic. CaO-TiO2 composite bioceramics can be used to repair damaged parts of the body, notably as dental implants, bone connectors, heart-valve support structures and skull-bone replacements. The CaO-TiO2 combination has several advantages: good biocompatibility, the ability to grow and develop together with native bone, and good mechanical resistance. Based on the above, this study aims to determine the effect of maturation time on the crystallinity, microstructure and hardness of CaO-TiO2 composite bioceramics prepared by coprecipitation. The starting materials were CaO derived from natural limestone collected at Balekambang beach, Malang Regency, and TiO2 of 99 percent purity. The samples were dissolved in distilled water and stirred for 15 hours at 70 degrees Celsius. The maturation time was varied over 12, 24, 36, 48 and 60 hours; the samples were annealed at 100 degrees Celsius for 24 hours and sintered for 4 hours at 1100 degrees Celsius. Crystal size, microstructure and hardness were characterised using XRD, SEM and Micro Vickers Hardness testing. The analysis showed that the synthesised CaO-TiO2 matched the CaO-TiO2 reference model from the Inorganic Crystal Structure Database (ICSD), with a score above 50, indicating successful synthesis. From theoretical calculations in which the FWHM (Full Width at Half Maximum) of the sample diffraction pattern was determined and then used in the Scherrer formula, the crystal size increased, by varying amounts, with the maturation time of the CaO-TiO2 composite, ranging between 45.06 nm and 70.85 nm.
The increase in crystal size with maturation time is accompanied by an increase in grain size, so fewer pores form in the material, as shown by the decrease in pore area fraction from 4.97 percent at a maturation time of 12 hours to 4.79 percent at 60 hours. The smaller the total pore fraction, the greater the hardness of the material, as shown by the highest hardness value, 497.2 MPa, obtained at a maturation time of 60 hours.
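The crystal sizes in this abstract come from the Scherrer formula, D = Kλ / (β cos θ), where β is the peak FWHM in radians and θ is half the diffraction angle 2θ. The sketch below is a minimal implementation under assumptions the abstract does not state: a shape factor K of 0.9, Cu Kα radiation (λ = 0.15406 nm), and no correction for instrumental broadening. The peak position and widths in the example are placeholders.

```python
import math


def scherrer_size_nm(two_theta_deg, fwhm_deg, wavelength_nm=0.15406, k=0.9):
    """Estimate crystallite size (nm) from one XRD peak.

    two_theta_deg: peak position 2-theta in degrees.
    fwhm_deg: full width at half maximum of the peak in degrees.
    wavelength_nm, k: assumed Cu K-alpha wavelength and shape factor.
    """
    if fwhm_deg <= 0:
        raise ValueError("FWHM must be positive")
    theta = math.radians(two_theta_deg / 2.0)  # Bragg angle
    beta = math.radians(fwhm_deg)              # FWHM must be in radians
    return k * wavelength_nm / (beta * math.cos(theta))


# Placeholder peak: a narrower peak yields a larger estimated crystallite.
for fwhm in (0.20, 0.12):
    size = scherrer_size_nm(two_theta_deg=33.0, fwhm_deg=fwhm)
    print(f"FWHM {fwhm:.2f} deg -> {size:.1f} nm")
```

The inverse dependence on β is what links the reported trend to the diffraction data: as maturation time increases and the peaks sharpen, the estimated crystal size grows.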


2021 ◽  
Vol 72 (05) ◽  
pp. 491-502
Author(s):  
LINLIN BAI ◽  
JIU ZHOU

Weft-backed structures with compound weft colours can express the mixed colour effect. However, this structure is not suitable for jacquard fabrics with a double-faced shading effect in the traditional single layer design mode. Taking twenty-thread sateen with a step number (S) of 7 as an example, this paper investigates a design method for compound full-backed structure with three shaded-weave databases (SWDs) by selecting the primary weaves (PWs), designing the compound full-backed technical points and establishing the compound structure database with three SWDs. With this design method, a double-faced shading effect in combination with non-backed and full-backed effects on different sides of the jacquard fabric at the same position is generated. The fabric colour card was produced with three SWDs and three sets of different coloured wefts, and their colour values were measured, followed by an analysis of the compound structures on the reverse side, lightness, colour purity and colour difference (DE*ab) of the specimens. The results showed that the three covering effects on the reverse side, partly covered, critical position and totally covered, could be adjusted by controlling the step number and the transition direction of PW-C. For the specimens on the edges of the fabric colour card, their lightness and colour purity values showed a uniform transition effect along with the shading process; their colour differences ranged from 1.23 to 3.69, both in the range of 2–5, and showed a trace or slight colour difference between two adjacent fabric specimens, indicating that the colour shading effect with the three SWDs is stable.
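The colour difference DE*ab between adjacent specimens is the Euclidean distance between their CIELAB coordinates (L*, a*, b*). The sketch below assumes the standard CIE 1976 form of that formula; the specimen values and function names are hypothetical and are not measurements from the paper.

```python
import math


def delta_e_ab(lab1, lab2):
    """CIE 1976 colour difference between two (L*, a*, b*) triples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(lab1, lab2)))


def adjacent_differences(specimens):
    """Colour difference between each specimen and the next one."""
    return [delta_e_ab(a, b) for a, b in zip(specimens, specimens[1:])]


# Hypothetical CIELAB readings along one edge of a colour card.
card_edge = [(50.0, 0.0, 0.0), (53.0, 4.0, 0.0), (53.0, 4.0, 12.0)]
print(adjacent_differences(card_edge))
```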


2021 ◽  
Author(s):  
Chunxiang Peng ◽  
Xiaogen Zhou ◽  
Yuhao Xia ◽  
Yang Zhang ◽  
Guijun Zhang

With the development of protein structure prediction methods and experimental structure determination techniques, the structures of single-domain proteins can now be modeled or experimentally solved relatively easily. However, more than 80% of eukaryotic proteins and 67% of prokaryotic proteins contain multiple domains. Constructing a unified multi-domain protein structure database will promote research on multi-domain proteins, especially the modeling of multi-domain protein structures. In this work, we develop a unified multi-domain protein structure database (MPDB). Based on MPDB, we also develop a server with two functional modules: (1) the culling module, which filters the whole MPDB according to input criteria; (2) the detection module, which identifies structural analogues of the full chain according to the structural similarity between the input domain models and the proteins in MPDB. The detection module can discover potential analogue structures, which will contribute to high-quality multi-domain protein structure modeling.

