Phylogenetic Placement of Exact Amplicon Sequences Improves Associations with Clinical Information

mSystems ◽  
2018 ◽  
Vol 3 (3) ◽  
Author(s):  
Stefan Janssen ◽  
Daniel McDonald ◽  
Antonio Gonzalez ◽  
Jose A. Navas-Molina ◽  
Lingjing Jiang ◽  
...  

ABSTRACT Recent algorithmic advances in amplicon-based microbiome studies enable the inference of exact amplicon sequence fragments. These new methods enable the investigation of sub-operational taxonomic units (sOTU) by removing erroneous sequences. However, short (e.g., 150-nucleotide [nt]) DNA sequence fragments do not contain sufficient phylogenetic signal to reproduce a reasonable tree, introducing a barrier in the utilization of critical phylogenetically aware metrics such as Faith’s PD or UniFrac. Although fragment insertion methods do exist, those methods have not been tested for sOTUs from high-throughput amplicon studies in insertions against a broad reference phylogeny. We benchmarked the SATé-enabled phylogenetic placement (SEPP) technique explicitly against 16S V4 sequence fragments and showed that it outperforms the conceptually problematic but often-used practice of reconstructing de novo phylogenies. In addition, we provide a BSD-licensed QIIME2 plugin (https://github.com/biocore/q2-fragment-insertion) for SEPP and integration into the microbial study management platform QIITA. IMPORTANCE The move from OTU-based to sOTU-based analysis, while providing additional resolution, also introduces computational challenges. We demonstrate that one popular method of dealing with sOTUs (building a de novo tree from the short sequences) can provide incorrect results in human gut metagenomic studies and show that phylogenetic placement of the new sequences with SEPP resolves this problem while also yielding other benefits over existing methods.
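
To illustrate why phylogenetically aware metrics depend on a tree, Faith's PD for a sample can be computed as the total branch length spanned by the taxa observed in that sample. The following is a minimal sketch on a hypothetical toy tree. The node names, branch lengths, and the `faith_pd` helper are assumptions made for illustration and are not part of SEPP or the q2-fragment-insertion plugin. It uses the rooted convention, in which the path from each observed tip to the root is included.

```python
def faith_pd(parents, observed_tips):
    """Sum the lengths of all branches that connect the observed tips to the root.

    parents maps each non-root node to a (parent_name, branch_length) pair.
    Each branch is counted once, even when several tips share it.
    """
    visited = set()
    total = 0.0
    for tip in observed_tips:
        node = tip
        # Walk towards the root, stopping at the root or at a branch already counted.
        while node in parents and node not in visited:
            parent, length = parents[node]
            visited.add(node)
            total += length
            node = parent
    return total


# Hypothetical toy tree: ((A:1.0, B:2.0)n1:0.5, C:3.0)root
toy_tree = {
    "A": ("n1", 1.0),
    "B": ("n1", 2.0),
    "n1": ("root", 0.5),
    "C": ("root", 3.0),
}

pd_ab = faith_pd(toy_tree, ["A", "B"])        # branches A, B and n1
pd_all = faith_pd(toy_tree, ["A", "B", "C"])  # every branch in the tree
print(pd_ab, pd_all)
```

Because the result depends entirely on branch lengths and topology, a poorly resolved tree built from short fragments would change the value for the same set of observed sequences.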

2019 ◽  
Author(s):  
Niclas Ståhl ◽  
Göran Falkman ◽  
Alexander Karlsson ◽  
Gunnar Mathiason ◽  
Jonas Boström

In medicinal chemistry programs it is key to design and make compounds that are efficacious and safe. This is a long, complex and difficult multi-parameter optimization process, often including several properties with orthogonal trends. New methods for the automated design of compounds against profiles of multiple properties are thus of great value. Here we present a fragment-based reinforcement learning approach based on an actor-critic model, for the generation of novel molecules with optimal properties. The actor and the critic are both modelled with bidirectional long short-term memory (LSTM) networks. The AI method learns how to generate new compounds with desired properties by starting from an initial set of lead molecules and then improving these by replacing some of their fragments. A balanced binary tree based on the similarity of fragments is used in the generative process to bias the output towards structurally similar molecules. The method is demonstrated by a case study showing that 93% of the generated molecules are chemically valid, and a third satisfy the targeted objectives, while there were none in the initial set.


2021 ◽  
Author(s):  
Lore Van Espen ◽  
Emilie Glad Bak ◽  
Leen Beller ◽  
Lila Close ◽  
Ward Deboutte ◽  
...  

Abstract Background: Gut viruses are important players in the complex human gut microbial ecosystem. The number of human gut virome studies is steadily increasing; however, we are still only scratching the surface of the immense viral diversity, as many wet lab and bioinformatics challenges remain. In this study, 254 virus-enriched faecal metagenomes from 204 Danish subjects were used to generate a Danish Enteric Virome Catalogue (DEVoC) of 12,986 non-redundant viral genome sequences encoding 190,029 viral genes, which formed 67,921 orthologous groups. The DEVoC was used to characterize the composition of the healthy DEVoC gut viromes from 46 children and adolescents (6-18 years old) and 45 adults (40-73 years old). Results: The majority of DEVoC viral sequences (67.3%) and proteins (61.6%) were not present in other (human gut) viral genome databases. Gut viromes of healthy Danish subjects mostly consisted of phages. While 39 phage genomes (PGs) were present in more than 10 healthy subjects, the degree of viral individuality was high. Among the 39 prevalent PGs, one was significantly more prevalent in the paediatric cohort, whereas two were more prevalent in adults. In 1,880 gut virome samples of 27 studies from across the world, the 39 prevalent PGs reveal several age-, geography- and disease-related prevalence patterns. Two PGs also showed a remarkably high prevalence worldwide – a crAss-like phage (20.6% prevalence), belonging to the tentative AlphacrAssvirinae subfamily, genus I; and a previously undescribed circular temperate phage (14.4% prevalence), named LoVEphage (because it encodes Lots of Viral Elements). A de novo assembly of selected public datasets generated an additional 18 circular LoVEphage-like genomes (67.9-72.4 kb).
CRISPR spacer analysis suggested Bacteroides as a host genus for the LoVEphage, and a closely related prophage was identified in Bacteroides dorei, further confirming the host. Conclusions: The DEVoC, the largest human gut virome catalogue generated from consistently processed faecal samples, facilitated analysis of healthy Danish human gut viromes, and we foresee that it will benefit future analysis of the roles of gut viruses in human health and disease. The identification of a previously undescribed prevalent phage illustrates the usefulness of developing a virome catalogue.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Daniel Roush ◽  
Ana Giraldo-Silva ◽  
Ferran Garcia-Pichel

Abstract Cyanobacteria are a widespread and important bacterial phylum, responsible for a significant portion of global carbon and nitrogen fixation. Unfortunately, reliable and accurate automated classification of cyanobacterial 16S rRNA gene sequences is muddled by conflicting systematic frameworks, inconsistent taxonomic definitions (including the phylum itself), and database errors. To address this, we introduce Cydrasil 3 (https://www.cydrasil.org), a curated 16S rRNA gene reference package, database, and web application designed to provide a full phylogenetic perspective for cyanobacterial systematics and routine identification. Cydrasil 3 contains over 1300 manually curated sequences longer than 1100 base pairs and can be used for phylogenetic placement or as a reference sequence set for de novo phylogenetic reconstructions. The web application (utilizing PaPaRA and EPA-ng) can place thousands of sequences into the reference tree and has detailed instructions on how to analyze results. The Cydrasil web application offers no taxonomic assignments; instead, it provides phylogenetic placement, as well as a searchable database with curation notes and metadata, and a mechanism for community feedback.


2018 ◽  
Author(s):  
Lucilla Pizzo ◽  
Matthew Jensen ◽  
Andrew Polyak ◽  
Jill A. Rosenfeld ◽  
Katrin Mannik ◽  
...  

Abstract Purpose: To assess the contribution of rare variants in the genetic background towards variability of neurodevelopmental phenotypes in individuals with rare copy-number variants (CNVs) and gene-disruptive mutations. Methods: We analyzed quantitative clinical information, exome-sequencing, and microarray data from 757 probands and 233 parents and siblings who carry disease-associated mutations. Results: The number of rare secondary mutations in functionally intolerant genes (second-hits) correlated with the expressivity of neurodevelopmental phenotypes in probands with 16p12.1 deletion (n=23, p=0.004) and in probands with autism carrying gene-disruptive mutations (n=184, p=0.03) compared to their carrier family members. Probands with 16p12.1 deletion and a strong family history presented more severe clinical features (p=0.04) and higher burden of second-hits compared to those with mild/no family history (p=0.001). The number of secondary variants also correlated with the severity of cognitive impairment in probands carrying pathogenic rare CNVs (n=53) or de novo mutations in disease genes (n=290), and negatively correlated with head size among 80 probands with 16p11.2 deletion. These second-hits involved known disease-associated genes such as SETD5, AUTS2, and NRXN1, and were enriched for genes affecting cellular and developmental processes. Conclusion: Accurate genetic diagnosis of complex disorders will require complete evaluation of the genetic background even after a candidate gene mutation is identified.


Diversity ◽  
2019 ◽  
Vol 11 (10) ◽  
pp. 178
Author(s):  
Peter Houde

“Genomic Analyses of Avian Evolution” is a “state of the art” showcase of the varied and rapidly evolving fields of inquiry enabled and driven by powerful new methods of genome sequencing and assembly as they are applied to some of the world’s most familiar and charismatic organisms—birds. The contributions to this Special Issue are as eclectic as avian genomics itself, but loosely interrelated by common underpinnings of phylogenetic inference, de novo genome assembly of non-model species, and genome organization and content.


Microbiome ◽  
2019 ◽  
Vol 7 (1) ◽  
Author(s):  
Ilia G. Halatchev ◽  
David O’Donnell ◽  
Matthew C. Hibberd ◽  
Jeffrey I. Gordon

Abstract Given the increasing use of gnotobiotic mouse models for deciphering the effects of human microbial communities on host biology, there is a need to develop new methods for characterizing these animals while maintaining their isolation from environmental microbes. We describe a method for performing open-circuit indirect calorimetry on gnotobiotic mice colonized with gut microbial consortia obtained from different human donors. In this illustrative case, cultured collections of gut bacterial strains were obtained from obese and lean co-twins. The approach allows microbial contributions to host energy homeostasis to be characterized.


2019 ◽  
Author(s):  
Andrey N. Shkoporov ◽  
Adam G. Clooney ◽  
Thomas D.S. Sutton ◽  
Feargal J. Ryan ◽  
Karen M. Daly ◽  
...  

Summary The human gut contains a vast array of viruses, mostly bacteriophages. The majority remain uncharacterised and their roles in shaping the gut microbiome and in impacting on human health remain poorly understood. Here we performed a longitudinal focused metagenomic study of faecal bacteriophage populations in healthy adults. Our results reveal high temporal stability and individual specificity of bacteriophage consortia which correlates with the bacterial microbiome. We report the existence of a stable, numerically predominant individual-specific persistent personal virome. Clustering of bacteriophage genomes and de novo taxonomic annotation identified several groups of crAss-like and Microviridae bacteriophages as the most stable colonizers of the human gut. CRISPR-based host prediction highlighted connections between these stable viral communities and highly predominant gut bacterial taxa such as Bacteroides, Prevotella and Faecalibacterium. This study provides insights into the structure of the human gut virome and serves as an important baseline for hypothesis-driven research.


2020 ◽  
Author(s):  
Yuya Kiguchi ◽  
Suguru Nishijima ◽  
Naveen Kumar ◽  
Masahira Hattori ◽  
Wataru Suda

Abstract Background: The ecological and biological features of the indigenous phage community (virome) in the human gut microbiome are poorly understood, possibly due to many fragmented contigs and fewer complete genomes based on conventional short-read metagenomics. Long-read sequencing technologies have attracted attention as an alternative approach to reconstruct long and accurate contigs from microbial communities. However, the impact of long-read metagenomics on human gut virome analysis has not been well evaluated. Results: Here we present chimera-less PacBio long-read metagenomics of multiple displacement amplification (MDA)-treated human gut virome DNA. The method included the development of a novel bioinformatics tool, SACRA (Split Amplified Chimeric Read Algorithm), which efficiently detects and splits numerous chimeric reads in PacBio reads from the MDA-treated virome samples. SACRA treatment of PacBio reads from five samples markedly reduced the average chimera ratio from 72 to 1.5%, generating chimera-less PacBio reads with an average read-length of 1.8 kb. De novo assembly of the chimera-less long reads generated contigs with an average N50 length of 11.1 kb, whereas those of MiSeq short reads from the same samples were 0.7 kb, dramatically improving contig extension. Alignment of both contig sets generated 378 high-quality merged contigs (MCs) composed of the minimum scaffolds of 434 MiSeq and 637 PacBio contigs, respectively, and also identified numerous MiSeq short fragmented contigs ≤500 bp additionally aligned to MCs, which possibly originated from a small fraction of MiSeq chimeric reads. The alignment also revealed that fragmentations of the scaffolded MiSeq contigs were caused primarily by genomic complexity of the community, including local repeats, hypervariable regions, and highly conserved sequences in and between the phage genomes. 
We identified 142 complete and near-complete phage genomes including 108 novel genomes, varying from 5 to 185 kb in length, the majority of which were predicted to be Microviridae phages including several variants with homologous but distinct genomes, which were fragmented in MiSeq contigs. Conclusions: Long-read metagenomics coupled with SACRA provides an improved method to reconstruct accurate and extended phage genomes from MDA-treated virome samples of the human gut, and potentially from other environmental virome samples.
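
The N50 values quoted above summarize assembly contiguity. N50 is the contig length at which contigs of that length or longer contain at least half of all assembled bases. A minimal sketch of the calculation follows. The `n50` helper and the example lengths are hypothetical and are not taken from SACRA or from the study's data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest and accumulated until the
    running total reaches at least half of the summed length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("N50 is undefined for an empty set of contigs")


# Hypothetical contig lengths in kb: total 30, so the half-way point is 15.
example_lengths = [2, 3, 4, 5, 6, 10]
example_n50 = n50(example_lengths)
print(example_n50)
```

An "average N50" as reported in the abstract would then be the mean of this per-sample value across samples.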

