Emerging SARS-CoV-2 variants follow a historical pattern recorded in outgroups infecting non-human hosts

Communications Biology ◽

10.1038/s42003-021-02663-4 ◽

2021 ◽

Vol 4 (1) ◽

Author(s):

Kazutaka Katoh ◽

Daron M. Standley

Keyword(s):

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Multiple Sequence ◽

Historical Pattern ◽

Close Relatives ◽

High Diversity ◽

S Proteins

Abstract: The ability to predict emerging variants of SARS-CoV-2 would be of enormous value, as it would enable proactive design of vaccines in advance of such emergence. We estimated the diversity of each site in a multiple sequence alignment (MSA) of the Spike (S) proteins from close relatives of SARS-CoV-2 that infected bats and pangolins before the pandemic. We then compared the locations of high-diversity sites in this MSA with those of mutations found in multiple emerging lineages of human-infecting SARS-CoV-2. This comparison revealed a significant correspondence, which suggests that a limited number of sites in this protein are repeatedly substituted in different lineages of this group of viruses. It follows, therefore, that the sites of future emerging mutations in SARS-CoV-2 can be predicted by analyzing their relatives (outgroups) that have infected non-human hosts. We discuss a possible evolutionary basis for these substitutions and provide a list of frequently substituted sites that potentially include future emerging variants in SARS-CoV-2.
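The comparison described in this abstract can be illustrated with a minimal sketch. The abstract does not say which diversity measure or statistical test the authors used, so the Shannon entropy per alignment column and the one-sided hypergeometric overlap test below are assumptions chosen for illustration, and the function names (`site_diversity`, `top_sites`, `overlap_p_value`) are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column, gap="-"):
    """Shannon entropy (bits) of one alignment column, ignoring gap characters."""
    residues = [c for c in column if c != gap]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def site_diversity(msa):
    """Per-site entropy for a list of equal-length aligned sequences."""
    if len({len(seq) for seq in msa}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy(col) for col in zip(*msa)]


def top_sites(diversity, k):
    """Indices (0-based) of the k sites with the highest diversity."""
    ranked = sorted(range(len(diversity)), key=lambda i: diversity[i], reverse=True)
    return set(ranked[:k])


def overlap_p_value(n_sites, high_div, mutated):
    """One-sided hypergeometric P(overlap >= observed) for two sets of sites."""
    k = len(high_div & mutated)
    K, n = len(high_div), len(mutated)
    total = math.comb(n_sites, n)
    tail = sum(
        math.comb(K, i) * math.comb(n_sites - K, n - i)
        for i in range(k, min(K, n) + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy alignment (assumption): four sequences, four columns.
    msa = ["ACDE", "ACDF", "ACEG", "ACEH"]
    diversity = site_diversity(msa)
    high = top_sites(diversity, 2)
    mutated = {2, 3}  # hypothetical mutation sites in an emerging lineage
    print(diversity, high, overlap_p_value(len(diversity), high, mutated))
```

A small p-value from `overlap_p_value` would indicate that high-diversity outgroup sites and observed mutation sites coincide more often than expected by chance, which is the kind of correspondence the abstract reports.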

Multiple Sequence Alignment and Profile Analysis of Protein Family Using Hidden Markov Model

International Journal of Scientific Research ◽

10.15373/22778179/june2013/66 ◽

2012 ◽

Vol 2 (6) ◽

pp. 208-211

Author(s):

Navjot Kaur ◽

Rajbir Singh Cheema ◽

Harmandeep Singh

Keyword(s):

Markov Model ◽

Hidden Markov Model ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Profile Analysis ◽

Hidden Markov ◽

Protein Family ◽

Multiple Sequence

Faculty Opinions recommendation of MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.731078852.793536612 ◽

2017 ◽

Author(s):

Feng Gao

Keyword(s):

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Online Service ◽

Multiple Sequence

Computational Analysis of Therapeutic Enzyme Uricase from Different Source Organisms

Current Proteomics ◽

10.2174/1570164616666190617165107 ◽

2020 ◽

Vol 17 (1) ◽

pp. 59-77

Author(s):

Anand Kumar Nelapati ◽

JagadeeshBabu PonnanEttiyappan

Keyword(s):

Uric Acid ◽

Amino Acid ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Protein Sequences ◽

Amino Acid Sequences ◽

Amino Acid Residues ◽

Multiple Sequence ◽

Physiochemical Properties ◽

Pharmaceutical Industries

Background: Hyperuricemia and gout are conditions that result from the accumulation of uric acid in the blood and urine. Uric acid is the product of the purine metabolic pathway in humans. Uricase is a therapeutic enzyme that reduces the concentration of uric acid in serum and urine by converting it into the more soluble allantoin. Uricases are found in many sources, including bacteria, fungi, yeast, plants and animals. Objective: The present study aims to elucidate the structure and physicochemical properties of uricase by in silico analysis. Methods: A total of sixty amino acid sequences of uricase from different sources were obtained from NCBI, and analyses including multiple sequence alignment (MSA), homology search, phylogenetic relationships, motif search, domain architecture and physicochemical properties (including pI, EC, Ai and Ii) were performed. Results: Multiple sequence alignment of all the selected protein sequences showed distinct differences between bacterial, fungal, plant and animal sources based on the position-specific presence of conserved amino acid residues. The maximum homology of all the selected protein sequences is between 51-388. Within individual categories, homology is between 16-337 for bacterial uricase, 14-339 for fungal uricase, 12-317 for plant uricase, and 37-361 for animal uricase. The phylogenetic tree constructed from the amino acid sequences showed clusters corresponding to the different sources of uricase. The physicochemical features revealed that uricase is 300-338 amino acid residues long, with a molecular weight of 33-39 kDa and a theoretical pI ranging from 4.95 to 8.88. The amino acid composition results showed that valine has a high average frequency of 8.79% compared with the other amino acids in all analyzed species. Conclusion: This work may be informative for the bioinformatics field and a stepping-stone for other researchers seeking to understand the physicochemical features, evolutionary history and structural motifs of uricase, which can be widely used in the biotechnological and pharmaceutical industries. Therefore, the proposed in silico analysis can be considered for protein engineering work, as well as for gout therapy.
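An average amino acid frequency such as the valine figure above can be computed in a few lines. The sketch below is illustrative only: the abstract does not say how the authors averaged, so taking the mean of per-sequence percentages is an assumption, and the function name `mean_composition` and the toy sequences are hypothetical.

```python
from collections import Counter


def mean_composition(sequences):
    """Mean per-sequence percentage of each residue across a set of sequences."""
    if not sequences or any(len(seq) == 0 for seq in sequences):
        raise ValueError("need at least one sequence and no empty sequences")
    totals = Counter()
    for seq in sequences:
        counts = Counter(seq.upper())
        for residue, n in counts.items():
            # Percentage of this residue within the current sequence.
            totals[residue] += 100.0 * n / len(seq)
    return {residue: t / len(sequences) for residue, t in totals.items()}


if __name__ == "__main__":
    # Toy sequences (assumption), not real uricase data.
    toy = ["VVAA", "VAAA"]
    for residue, pct in sorted(mean_composition(toy).items()):
        print(residue, round(pct, 2))
```

Averaging per sequence weights every protein equally regardless of length; pooling all residues before dividing would instead weight longer sequences more heavily.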

LegumeDB: Development of Legume Medicinal Plant Database and Comparative Molecular Evolutionary Analysis of matK Proteins of Legumes and Mangroves

Current Nutrition & Food Science ◽

10.2174/1573401314666180223143523 ◽

2019 ◽

Vol 15 (4) ◽

pp. 353-362

Author(s):

Sambhaji B. Thakar ◽

Maruti J. Dhanavade ◽

Kailas D. Sonawane

Keyword(s):

Phylogenetic Analysis ◽

Medicinal Plants ◽

Homology Modeling ◽

Sequence Alignment ◽

Vigna Unguiculata ◽

Multiple Sequence Alignment ◽

Legume Species ◽

Mangrove Species ◽

Multiple Sequence ◽

Thespesia Populnea

Background: Legume plants are known for their rich medicinal and nutritional value. A large amount of medicinal information on various legume plants is scattered in the form of text. Objective: It is essential to design and construct a legume medicinal plants database that integrates the respective classes of legumes and includes knowledge of their medicinal applications along with their protein/enzyme sequences. Methods: The Legume Medicinal Plants Database (LegumeDB) was designed and developed using Microsoft SQL Server 2017. The DBMS was used as the back end, ASP.Net was used for front-end operations, and VB.Net was used as the programming language. Multiple sequence alignment, phylogenetic analysis and homology modeling techniques were also used. Results: This database includes information on 50 legume medicinal species, which may help researchers explore this information. Further, maturase K (matK) protein sequences of legumes and mangroves were retrieved from NCBI for multiple sequence alignment and phylogenetic analysis to understand the evolutionary lineage between legumes and mangroves. Homology modeling was used to determine the three-dimensional structure of matK from the legume species Vigna unguiculata, using matK of the mangrove species Thespesia populnea as a template. The matK sequence analysis identified residues conserved among legume and mangrove species. Conclusion: Phylogenetic analysis revealed that the legume species Vigna unguiculata and the mangrove species Thespesia populnea are closely related, indicating their similarity and origin from a common ancestor. Thus, these studies may help in understanding the evolutionary relationship between legumes and mangroves. LegumeDB availability: http://legumedatabase.co.in

A Hierarchical Classification for the Selection of the Most Suitable Multiple Sequence Alignment Methodology

Current Bioinformatics ◽

10.2174/157489361002150518145112 ◽

2015 ◽

Vol 10 (2) ◽

pp. 199-207

Author(s):

Francisco Ortuño ◽

Hector Pomares ◽

Olga Valenzuela ◽

Carolina Torres ◽

Ignacio Rojas

Keyword(s):

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Hierarchical Classification ◽

Multiple Sequence ◽

Selection Of

Reference Alignment Based Methods for Quality Evaluation of Multiple Sequence Alignment - A Survey

Current Bioinformatics ◽

10.2174/15748936113089990005 ◽

2013 ◽

Vol 8 (999) ◽

pp. 1-6

Author(s):

Pawel Wojciechowski ◽

Piotr Formanowicz ◽

Jacek Blazewicz

Keyword(s):

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Quality Evaluation ◽

Reference Alignment ◽

Multiple Sequence

Amino Acids Sequence-based Analysis of Arginine Deiminase from Different Prokaryotic Organisms: An In Silico Approach

Recent Patents on Biotechnology ◽

10.2174/1872208314666200324114441 ◽

2020 ◽

Vol 14 (3) ◽

pp. 235-246

Author(s):

Sara Abdollahi ◽

Mohammad H. Morowvat ◽

Amir Savardashtaki ◽

Cambyz Irajie ◽

Sohrab Najafipour ◽

...

Keyword(s):

Amino Acid ◽

Physicochemical Properties ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

In Silico ◽

Bacterial Species ◽

Arginine Deiminase ◽

Antitumor Effects ◽

Multiple Sequence ◽

In Silico Studies

Background: Arginine deiminase is a bacterial enzyme that degrades L-arginine. Some human cancers, such as hepatocellular carcinoma (HCC) and melanoma, are auxotrophic for arginine. Therefore, PEGylated arginine deiminase (ADI-PEG20) is a good anticancer candidate with antitumor effects. It causes local depletion of L-arginine and growth inhibition in arginine-auxotrophic tumor cells. The FDA and EMA have granted orphan status to this drug. Some recently published patents have dealt with this enzyme or its PEGylated form. Objective: Given the increasing attention to this enzyme, we aimed to evaluate and compare 30 arginine deiminase proteins from different bacterial species through in silico analysis. Methods: The analyses included the investigation of physicochemical properties, multiple sequence alignment (MSA), and motif, superfamily, phylogenetic and 3D comparative analyses of arginine deiminase proteins through various bioinformatics tools. Results: The most abundant amino acid in the arginine deiminase proteins is leucine (10.13%), while the least abundant is cysteine (0.98%). Multiple sequence alignment showed 47 conserved patterns among the 30 arginine deiminase amino acid sequences. The results of sequence homology among the 30 different groups of arginine deiminase enzymes revealed that all the studied sequences belong to the amidinotransferase superfamily. Based on the phylogenetic analysis, two major clusters were identified. Considering the results of the various in silico studies, we selected the five best candidates for further investigation. The 3D structures of the best five arginine deiminase proteins were generated by the I-TASSER server and PyMOL. The RAMPAGE analysis revealed that 81.4%-91.4% of the selected sequences were located in the favored region of arginine deiminase proteins. Conclusion: The results of this study shed light on the basic physicochemical properties of thirty major arginine deiminase sequences. The obtained data could be employed for further in vivo and clinical studies and also for developing the related therapeutic enzymes.

A data-centric pipeline using convolutional neural network to select better multiple sequence alignment method

Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics ◽

10.1145/3388440.3414909 ◽

2020 ◽

Author(s):

Mengmeng Kuang ◽

Hing-fung Ting

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Alignment Method ◽

Multiple Sequence

Novel missense mutation of SASH1 in a Chinese family with dyschromatosis universalis hereditaria

BMC Medical Genomics ◽

10.1186/s12920-021-01014-w ◽

2021 ◽

Vol 14 (1) ◽

Author(s):

Lu Cao ◽

Ruixue Zhang ◽

Liang Yong ◽

Shirui Chen ◽

Hui Zhang ◽

...

Keyword(s):

Missense Mutation ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Molecular Genetic ◽

Family Members ◽

Clinical Manifestations ◽

Gene Mutations ◽

Mendelian Inheritance ◽

Chinese Family ◽

Multiple Sequence

Background: Dyschromatosis universalis hereditaria (DUH) is a pigmentary dermatosis characterized by generalized mottled macules with hypopigmentation and hyperpigmentation. ABCB6 and SASH1 are recently reported pathogenic genes related to DUH, and the aim of this study was to identify the causative mutations in a Chinese family with DUH. Methods: Sanger sequencing was performed to investigate the clinical manifestation and molecular genetic basis of these familial cases of DUH, and bioinformatics tools and multiple sequence alignment were used to analyse the pathogenicity of mutations. Results: A novel missense mutation, c.1529G>A, in the SASH1 gene was identified, and this mutation was not found in the National Center for Biotechnology Information Database of Short Genetic Variation, Online Mendelian Inheritance in Man, ClinVar, or 1000 Genomes Project databases. All in silico predictors suggested that the observed substitution mutation was deleterious. Furthermore, multiple sequence alignment of SASH1 revealed that the residue affected by the p.S510N mutation was highly conserved during evolution. In addition, we reviewed the previously reported DUH-related gene mutations in SASH1 and ABCB6. Conclusion: Although the affected family members had identical mutations, differences in the clinical manifestations of these family members were observed, which reveals the complexity of the phenotype-influencing factors in DUH. Our findings reveal the mutation responsible for DUH in this family and broaden the mutational spectrum of the SASH1 gene.

A hybrid genetic algorithm with chemical reaction optimization for multiple sequence alignment.

2019 22nd International Conference on Computer and Information Technology (ICCIT) ◽

10.1109/iccit48885.2019.9038510 ◽

2019 ◽

Author(s):

Sajib Chatterjee ◽

Promal Barua ◽

M. M. Hasibuzzaman ◽

Afrin Iftiea ◽

Tarpan Mukharjee ◽

...

Keyword(s):

Genetic Algorithm ◽

Chemical Reaction ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Hybrid Genetic Algorithm ◽

Chemical Reaction Optimization ◽

Multiple Sequence ◽

Reaction Optimization
