Semi-Supervised Pipeline for Autonomous Annotation of SARS-CoV-2 Genomes

Viruses ◽  
2021 ◽  
Vol 13 (12) ◽  
pp. 2426
Author(s):  
Kristen L. Beck ◽  
Edward Seabolt ◽  
Akshay Agarwal ◽  
Gowri Nayar ◽  
Simone Bianco ◽  
...  

SARS-CoV-2 genomic sequencing efforts have scaled dramatically to address the current global pandemic and aid public health. However, autonomous genome annotation of SARS-CoV-2 genes, proteins, and domains is not readily accomplished by existing methods and results in missing or incorrect sequences. To overcome this limitation, we developed a novel semi-supervised pipeline for automated gene, protein, and functional domain annotation of SARS-CoV-2 genomes that differentiates itself by not relying on the use of a single reference genome and by overcoming atypical genomic traits that challenge traditional bioinformatic methods. We analyzed an initial corpus of 66,000 SARS-CoV-2 genome sequences collected from labs across the world using our method and identified the comprehensive set of known proteins with 98.5% set membership accuracy and 99.1% accuracy in length prediction, compared to proteome references, including Replicase polyprotein 1ab (with its transcriptional slippage site). Compared to other published tools, such as Prokka (base) and VAPiD, we yielded a 6.4- and 1.8-fold increase in protein annotations. Our method generated 13,000,000 gene, protein, and domain sequences—some conserved across time and geography and others representing emerging variants. We observed 3362 non-redundant sequences per protein on average within this corpus and described key D614G and N501Y variants spatiotemporally in the initial genome corpus. For spike glycoprotein domains, we achieved greater than 97.9% sequence identity to references and characterized receptor binding domain variants. We further demonstrated the robustness and extensibility of our method on an additional 4000 variant diverse genomes containing all named variants of concern and interest as of August 2021. 
In this cohort, we successfully identified all keystone spike glycoprotein mutations in our predicted protein sequences with greater than 99% accuracy as well as demonstrating high accuracy of the protein and domain annotations. This work comprehensively presents the molecular targets to refine biomedical interventions for SARS-CoV-2 with a scalable, high-accuracy method to analyze newly sequenced infections as they arise.
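As an illustration of what checking a predicted protein sequence for a named substitution such as D614G or N501Y can look like, here is a minimal sketch. It is not the pipeline described in the abstract: the function names are hypothetical, and it assumes the predicted sequence shares the reference coordinate system (no upstream insertions or deletions), which a real workflow would establish by alignment first.

```python
def parse_substitution(label):
    """Split a label such as 'D614G' into (reference residue, 1-based position, variant residue)."""
    return label[0], int(label[1:-1]), label[-1]


def call_substitution(protein_seq, label):
    """Classify the residue found at the labelled position.

    Assumes protein_seq uses reference coordinates (hypothetical simplification).
    Returns 'variant', 'reference', 'other', or 'out_of_range'.
    """
    ref, pos, alt = parse_substitution(label)
    if pos < 1 or pos > len(protein_seq):
        return "out_of_range"
    residue = protein_seq[pos - 1]  # convert 1-based position to 0-based index
    if residue == alt:
        return "variant"
    if residue == ref:
        return "reference"
    return "other"


# Toy sequences (not real spike proteins): only position 614 matters here.
toy_variant = "A" * 613 + "G" + "A" * 10
toy_reference = "A" * 613 + "D" + "A" * 10
print(call_substitution(toy_variant, "D614G"))
print(call_substitution(toy_reference, "D614G"))
```

Returning a label rather than a boolean keeps truncated sequences and unexpected residues distinguishable from the reference state.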

2020 ◽  
Author(s):  
Angelo Spinello ◽  
Andrea Saltalamacchia ◽  
Alessandra Magistrato

The latest outbreak of a new pathogenic coronavirus (SARS-CoV-2) is provoking a global health, economic and societal crisis. All-atom simulations enabled us to uncover the key molecular traits underlying the high affinity of SARS-CoV-2 spike glycoprotein towards its human receptor, providing a rationale to its high infectivity. Harnessing this knowledge can boost developing effective medical countermeasures to fight the current global pandemic.


2020 ◽  
Vol 17 ◽  
Author(s):  
Ajoy Basak ◽  
Sarmistha Basak

The current global pandemic outbreak of a novel type of coronavirus, termed COVID-19 by the World Health Organization, has become a grave concern for human health and the world economy. Intense research efforts are now underway worldwide to combat and prevent the spread of this deadly disease. This zoonotic virus, native to bat populations, is most likely transmitted to humans via a host reservoir. Due to its close similarity to the previously known SARS CoV (Severe Acute Respiratory Syndrome Coronavirus) of 2002 and the related MERS CoV (Middle East Respiratory Syndrome Coronavirus) of 2012, it is also known as SARS CoV2. Unlike them, however, it is far more infectious, virulent and lethal. Among its various proteins, the surface spike glycoprotein “S” has drawn significant attention because of its role in viral recognition and the host-virus fusion process. A detailed comparative analysis of the “S” proteins of SARS CoV (now called SARS CoV1), SARS CoV2 (COVID-19) and MERS CoV, based on structure, sequence alignment, host cleavage sites, receptor binding domains, potential glycosylation and Cys-disulphide bridge locations, has been performed. It revealed some key features and variations that may explain the high infectivity and virulence of COVID-19. Moreover, this information may prove useful in the search for COVID-19 therapeutics and vaccines.


2018 ◽  
Author(s):  
Moritz Schaefer ◽  
Dr. Djork-Arné Clevert ◽  
Dr. Bertram Weiss ◽  
Dr. Andreas Steffen

Summary: sgRNAs targeting the same gene can vary significantly in efficacy and specificity. PAVOOC (Prediction And Visualization of On- and Off-targets for CRISPR) is a web-based CRISPR sgRNA design tool that employs state-of-the-art machine learning models to prioritize the most effective candidate sgRNAs. In contrast to other tools, it maps sgRNAs to functional domains and protein structures and visualizes cut sites on the corresponding protein crystal structures. Furthermore, PAVOOC supports HDR template generation for gene editing experiments and the visualization of the mutated amino acids in 3D. Availability and Implementation: PAVOOC is available under https://pavooc.me and accessible using current browsers (Chrome/Chromium recommended). The source code is hosted at github.com/moritzschaefer/pavooc under the MIT License. The backend, including data processing steps, and the frontend are implemented in Python 3 and ReactJS, respectively. All components run in a simple Docker environment. Contact: [email protected]


Author(s):  
Kevin R. McCarthy ◽  
Linda J. Rennick ◽  
Sham Nambulli ◽  
Lindsey R. Robinson-McCarthy ◽  
William G. Bain ◽  
...  

Zoonotic pandemics follow the spillover of animal viruses into highly susceptible human populations. Often, pandemics wane and the viruses become endemic pathogens. Sustained circulation requires evasion of protective immunity elicited by previous infections. The emergence of SARS-CoV-2 has initiated a global pandemic. Because coronaviruses have a lower substitution rate than other RNA viruses, there was hope that the spike glycoprotein would be an antigenically stable vaccine target. However, we describe an evolutionary pattern of recurrent deletions at four antigenic sites in the spike glycoprotein. Deletions abolish binding of a reported neutralizing antibody. Circulating SARS-CoV-2 variants are continually exploring genetic and antigenic space via deletion in individual patients and at global scales. In viruses where substitutions are relatively infrequent, deletions represent a mechanism to drive rapid evolution, potentially promoting antigenic drift.


2019 ◽  
Vol 20 (5) ◽  
pp. 389-399
Author(s):  
Wangren Qiu ◽  
Chunhui Xu ◽  
Xuan Xiao ◽  
Dong Xu

Background: Ubiquitination, as a post-translational modification, is a crucial biological process in cell signaling, apoptosis, and localization. Identification of ubiquitination proteins is of fundamental importance for understanding the molecular mechanisms in biological systems and diseases. Although high-throughput experimental studies using mass spectrometry have identified many ubiquitination proteins and ubiquitination sites, the vast majority of ubiquitination proteins remain undiscovered, even in well-studied model organisms. Objective: To reduce experimental costs, computational methods have been introduced to predict ubiquitination sites, but the accuracy is unsatisfactory. If it can be predicted whether a protein can be ubiquitinated or not, it will help in predicting ubiquitination sites. However, all the computational methods so far can only predict ubiquitination sites. Methods: In this study, the first computational method for predicting ubiquitination proteins without relying on ubiquitination site prediction has been developed. The method extracts features from sequence conservation information through a grey system model, as well as functional domain annotation and subcellular localization. Results: Together with the feature analysis and application of the relief feature selection algorithm, the results of 5-fold cross-validation on three datasets achieved a high accuracy of 90.13%, with a Matthews correlation coefficient of 80.34%. The predicted results on an independent test dataset achieved an accuracy of 87.71% and a Matthews correlation coefficient of 75.43%, better than the prediction from the best ubiquitination site prediction tool available. Conclusion: Our study may guide experimental design and provide useful insights for studying the mechanisms and modulation of ubiquitination pathways. The code is available at: https://github.com/Chunhuixu/UBIPredic_QWRCHX.
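For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how accuracy and the Matthews correlation coefficient are computed from binary confusion-matrix counts. The function names and the example counts are illustrative assumptions and are not taken from the study or its repository.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for a balanced test set of 100 proteins.
tp, tn, fp, fn = 45, 45, 5, 5
print(f"accuracy = {accuracy(tp, tn, fp, fn):.2%}")
print(f"MCC      = {matthews_corrcoef(tp, tn, fp, fn):.2%}")
```

The coefficient ranges from -1 to 1; the abstract reports it as a percentage, so 80.34% corresponds to 0.8034. Unlike accuracy, it stays informative when the two classes are of very different sizes.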


2021 ◽  
pp. 13-15
Author(s):  
V. Radha Lakshmi ◽  
K. Anusha Reddy

Introduction: Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV2), has become a global pandemic, giving rise to a serious health threat globally. In India we have seen a two-wave pattern of reported cases, with the peak of the first wave in September 2020 and the peak of the second wave in May 2021. Women undergoing pregnancy and those at the time of childbirth and puerperium constitute potentially vulnerable populations for COVID-19. Aims And Objectives: To evaluate differences in clinical presentation, co-morbidities, pregnancy complications and outcomes in women with COVID-19 during the first wave and second wave of the COVID-19 pandemic. Materials And Methods: We conducted a retrospective observational cohort study of all hospitalized pregnant and postpartum women with SARS-CoV2 infection in Government General Hospital, Kurnool. All the patients admitted from 1st May to 31st October 2020 were considered to be in the first wave and those admitted from 1st April to 31st June were considered to be in the second wave. Results: Incidence of cases increased from 14.18% to 16.8%. There was a two-fold increase in symptomatic cases, from 4.2% to 8%. Patients in the second wave were younger, in the age group of 16-25 years. The proportion of pregnant women delivered by Caesarean section increased from 57.5% to 61.1%. ICU admissions increased significantly from 2.7% to 3.1%. Case fatality rate increased from 0.4% to 1.1%. Conclusion: As observed from the above results, there is a higher frequency of severe COVID-19, increased ICU admissions and maternal deaths in the second wave of the COVID-19 pandemic as compared to the first wave. Although the exact causes of the increase in severity and mortality are unknown, it is probably due to the emergence of more pathogenic strains of SARS-CoV2.


Author(s):  
Wang-Ren Qiu ◽  
Ao Xu ◽  
Zhao-Chun Xu ◽  
Chun-Hua Zhang ◽  
Xuan Xiao

2020 ◽  
Author(s):  
William R. Martin ◽  
Feixiong Cheng

The global pandemic of Coronavirus Disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has led to the deaths of more than 350,000 people worldwide and over 100,000 in the United States alone. However, there are currently no proven effective pharmacotherapies for COVID-19. Here, we combine homology modeling, molecular docking, molecular dynamics simulation, and binding affinity calculations to determine potential targets for toremifene, a selective estrogen receptor modulator which we have previously identified as a SARS-CoV-2 inhibitor. Our results indicate that toremifene may inhibit the spike glycoprotein, which aids fusion of the viral membrane with the cell membrane, via a perturbation to the fusion core. An interaction between the dimethylamine end of toremifene and residues Q954 and N955 in heptad repeat 1 (HR1) perturbs the structure, causing a shift from what is normally a long, helical region to short helices connected by unstructured regions. Additionally, we found a strong interaction between toremifene and the methyltransferase non-structural protein (NSP) 14, which could inhibit viral replication via its active site. These results suggest potential structural mechanisms by which toremifene blocks the spike protein and NSP14 of SARS-CoV-2, offering a drug candidate for COVID-19.

