Connecting structure to function with the recovery of over 1000 high-quality activated sludge metagenome-assembled genomes encoding full-length rRNA genes using long-read sequencing

Author(s):  
Caitlin M Singleton ◽  
Francesca Petriglieri ◽  
Jannie M Kristensen ◽  
Rasmus H Kirkegaard ◽  
Thomas Y Michaelsen ◽  
...  

Abstract Microorganisms are critical to water recycling, pollution removal and resource recovery processes in the wastewater industry. While the structure of this complex community is increasingly understood based on 16S rRNA gene studies, this structure cannot currently be linked to functional potential due to the absence of high-quality metagenome-assembled genomes (MAGs) with full-length rRNA genes for nearly all species. Here, we sequence 23 Danish full-scale wastewater treatment plant metagenomes, producing >1 Tbp of long-read and >0.9 Tbp of short-read data. We recovered 1083 high-quality MAGs, including 57 closed circular genomes. The MAGs accounted for ~30% of the community, and meet the stringent MIMAG high-quality draft requirements including full-length rRNA genes. We show how novel high-quality MAGs in combination with >13 years of amplicon data, Raman microspectroscopy and fluorescence in situ hybridisation can be used to uncover abundant undescribed lineages belonging to important functional groups.

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Caitlin M. Singleton ◽  
Francesca Petriglieri ◽  
Jannie M. Kristensen ◽  
Rasmus H. Kirkegaard ◽  
Thomas Y. Michaelsen ◽  
...  

Abstract Microorganisms play crucial roles in water recycling, pollution removal and resource recovery in the wastewater industry. The structure of these microbial communities is increasingly understood based on 16S rRNA amplicon sequencing data. However, such data cannot be linked to functional potential in the absence of high-quality metagenome-assembled genomes (MAGs) for nearly all species. Here, we use long-read and short-read sequencing to recover 1083 high-quality MAGs, including 57 closed circular genomes, from 23 Danish full-scale wastewater treatment plants. The MAGs account for ~30% of the community based on relative abundance, and meet the stringent MIMAG high-quality draft requirements including full-length rRNA genes. We use the information provided by these MAGs in combination with >13 years of 16S rRNA amplicon sequencing data, as well as Raman microspectroscopy and fluorescence in situ hybridisation, to uncover abundant undescribed lineages belonging to important functional groups.
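As a rough illustration of what the MIMAG high-quality draft requirements mentioned in this abstract involve, the sketch below checks a single MAG against the thresholds of the MIMAG standard (>90% completeness, <5% contamination, 5S, 16S and 23S rRNA genes present, and at least 18 tRNAs). The `MagStats` type and the `is_mimag_high_quality` function are hypothetical names introduced here for illustration; they are not part of the study's pipeline, and the input values would normally come from separate completeness and gene-annotation tools.

```python
from dataclasses import dataclass


@dataclass
class MagStats:
    """Summary statistics for one MAG (hypothetical container, for illustration only)."""
    completeness: float   # estimated completeness, in percent
    contamination: float  # estimated contamination, in percent
    has_5s: bool          # 5S rRNA gene detected
    has_16s: bool         # 16S rRNA gene detected
    has_23s: bool         # 23S rRNA gene detected
    trna_count: int       # number of distinct tRNAs detected


def is_mimag_high_quality(mag: MagStats) -> bool:
    """Return True if the MAG meets the MIMAG high-quality draft thresholds."""
    return (
        mag.completeness > 90.0
        and mag.contamination < 5.0
        and mag.has_5s
        and mag.has_16s
        and mag.has_23s
        and mag.trna_count >= 18
    )


if __name__ == "__main__":
    # A complete, clean MAG that lacks a 16S rRNA gene still fails the check.
    good = MagStats(97.5, 1.2, True, True, True, 20)
    no_16s = MagStats(97.5, 1.2, True, False, True, 20)
    print(is_mimag_high_quality(good), is_mimag_high_quality(no_16s))
```

The second case shows why rRNA recovery matters: a bin can be near-complete and clean and still fall short of high-quality status when an rRNA gene is missing, which is common in short-read assemblies.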


Microbiome ◽  
2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Benjamin J. Callahan ◽  
Dmitry Grinevich ◽  
Siddhartha Thakur ◽  
Michael A. Balamotis ◽  
Tuval Ben Yehezkel

Abstract Background Out of the many pathogenic bacterial species that are known, only a fraction are readily identifiable directly from a complex microbial community using standard next generation DNA sequencing. Long-read sequencing offers the potential to identify a wider range of species and to differentiate between strains within a species, but attaining sufficient accuracy in complex metagenomes remains a challenge. Methods Here, we describe and analytically validate LoopSeq, a commercially available synthetic long-read (SLR) sequencing technology that generates highly accurate long reads from standard short reads. Results LoopSeq reads are sufficiently long and accurate to identify microbial genes and species directly from complex samples. LoopSeq perfectly recovered the full diversity of 16S rRNA genes from known strains in a synthetic microbial community. Full-length LoopSeq reads had a per-base error rate of 0.005%, which exceeds the accuracy reported for other long-read sequencing technologies. 18S-ITS and genomic sequencing of fungal and bacterial isolates confirmed that LoopSeq sequencing maintains that accuracy for reads up to 6 kb in length. LoopSeq full-length 16S rRNA reads could accurately classify organisms down to the species level in rinsate from retail meat samples, and could differentiate strains within species identified by the CDC as potential foodborne pathogens. Conclusions The order-of-magnitude improvement in length and accuracy over standard Illumina amplicon sequencing achieved with LoopSeq enables accurate species-level and strain identification from complex- to low-biomass microbiome samples. The ability to generate accurate and long microbiome sequencing reads using standard short read sequencers will accelerate the building of quality microbial sequence databases and removes a significant hurdle on the path to precision microbial genomics.
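To put the reported per-base error rate of 0.005% in more familiar terms, the sketch below converts an error rate to a Phred quality score using the standard definition Q = -10 log10(p), and estimates the expected number of errors in a read. The function names are hypothetical, and the 1,500 bp read length is an assumption chosen to approximate a full-length 16S rRNA gene; it is not a figure from the abstract.

```python
import math


def phred_from_error_rate(p: float) -> float:
    """Convert a per-base error probability (0 < p <= 1) to a Phred quality score."""
    if not 0.0 < p <= 1.0:
        raise ValueError("error probability must be in (0, 1]")
    return -10.0 * math.log10(p)


def expected_errors(read_length: int, p: float) -> float:
    """Expected number of erroneous bases in a read, assuming a uniform error rate."""
    return read_length * p


if __name__ == "__main__":
    p = 0.005 / 100  # 0.005% expressed as a probability
    print(f"Phred score: {phred_from_error_rate(p):.1f}")
    # Assumed read length of 1,500 bp (roughly one full-length 16S rRNA gene).
    print(f"Expected errors per 1,500 bp read: {expected_errors(1500, p):.3f}")
```

Under these assumptions the error rate corresponds to roughly Q43, meaning most full-length 16S reads would contain no errors at all.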


Microbiome ◽  
2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Yusuke Okazaki ◽  
Shohei Fujinaga ◽  
Michaela M. Salcher ◽  
Cristiana Callieri ◽  
Atsushi Tanaka ◽  
...  

Abstract Background Freshwater ecosystems are inhabited by members of cosmopolitan bacterioplankton lineages despite the disconnected nature of these habitats. The lineages are delineated based on > 97% 16S rRNA gene sequence similarity, but their intra-lineage microdiversity and phylogeography, which are key to understanding the eco-evolutional processes behind their ubiquity, remain unresolved. Here, we applied long-read amplicon sequencing targeting nearly full-length 16S rRNA genes and the adjacent ribosomal internal transcribed spacer sequences to reveal the intra-lineage diversities of pelagic bacterioplankton assemblages in 11 deep freshwater lakes in Japan and Europe. Results Our single nucleotide-resolved analysis, which was validated using shotgun metagenomic sequencing, uncovered 7–101 amplicon sequence variants for each of the 11 predominant bacterial lineages and demonstrated sympatric, allopatric, and temporal microdiversities that could not be resolved through conventional approaches. Clusters of samples with similar intra-lineage population compositions were identified, which consistently supported genetic isolation between Japan and Europe. At a regional scale (up to hundreds of kilometers), dispersal between lakes was unlikely to be a limiting factor, and environmental factors or genetic drift were potential determinants of population composition. The extent of microdiversification varied among lineages, suggesting that highly diversified lineages (e.g., Iluma-A2 and acI-A1) achieve their ubiquity by containing a consortium of genotypes specific to each habitat, while less diversified lineages (e.g., CL500-11) may be ubiquitous due to a small number of widespread genotypes. The lowest extent of intra-lineage diversification was observed among the dominant hypolimnion-specific lineage (CL500-11), suggesting that their dispersal among lakes is not limited despite the hypolimnion being a more isolated habitat than the epilimnion. 
Conclusions Our novel approach complemented the limited resolution of short-read amplicon sequencing and limited sensitivity of the metagenome assembly-based approach, and highlighted the complex ecological processes underlying the ubiquity of freshwater bacterioplankton lineages. To fully exploit the performance of the method, its relatively low read throughput is the major bottleneck to be overcome in the future.


2019 ◽  
Vol 67 (7) ◽  
pp. 521
Author(s):  
Magdalena Vaio ◽  
Cristina Mazzella ◽  
Marcelo Guerra ◽  
Pablo Speranza

The Dilatata group of Paspalum includes species and biotypes native to temperate South America. Among them, five sexual allotetraploids (x = 10) share the same IIJJ genome formula: P. urvillei Steud, P. dasypleurum Kunze ex Desv., P. dilatatum subsp. flavescens Roseng., B.R. Arrill. & Izag., and two biotypes P. dilatatum Vacaria and P. dilatatum Virasoro. Previous studies suggested P. intermedium Munro ex Morong & Britton and P. juergensii Hack. or related species as their putative progenitors and donors of the I and J genome, respectively, and pointed to a narrow genetic base for their maternal origin. It has not yet been established whether the various members of the Dilatata group are the result of a single or of multiple allopolyploid formations. Here, we aimed to study the evolutionary dynamics of rRNA genes after allopolyploidisation in the Dilatata group of Paspalum and shed some light on the genome restructuring of the tetraploid taxa with the same genome formula. We used double target fluorescence in situ hybridisation of 35S and 5S rDNA probes and sequenced the nrDNA internal transcribed spacer (ITS) region. A variable number of loci at the chromosome ends were observed for the 35S rDNA, from 2 to 6, suggesting gain and loss of sites. For the 5S rDNA, only one centromeric pair of signals was observed, indicating a remarkable loss after polyploidisation. All ITS sequences generated were nearly identical to the one found for P. intermedium. Although sequences showed a directional homogenisation towards the putative paternal progenitor in all tetraploid species, the observed differences in the number and loss of rDNA sites suggest independent ongoing diploidisation processes in all taxa and genome restructuring following polyploidy.


PeerJ ◽  
2016 ◽  
Vol 4 ◽  
pp. e2492 ◽  
Author(s):  
Catherine M. Burke ◽  
Aaron E. Darling

Background The bacterial 16S rRNA gene has historically been used in defining bacterial taxonomy and phylogeny. However, there are currently no high-throughput methods to sequence full-length 16S rRNA genes present in a sample with precision. Results We describe a method for sequencing near full-length 16S rRNA gene amplicons using the high throughput Illumina MiSeq platform and test it using DNA from human skin swab samples. Proof of principle of the approach is demonstrated, with the generation of 1,604 sequences greater than 1,300 nt from a single Nano MiSeq run, with accuracy estimated to be 100-fold higher than standard Illumina reads. The reads were chimera filtered using information from a single molecule dual tagging scheme that boosts the signal available for chimera detection. Conclusions This method could be scaled up to generate many thousands of sequences per MiSeq run and could be applied to other sequencing platforms. This has great potential for populating databases with high quality, near full-length 16S rRNA gene sequences from under-represented taxa and environments and facilitates analyses of microbial communities at higher resolution.


2020 ◽  
Vol 13 (1) ◽  
Author(s):  
Jenny G. Maloney ◽  
Aleksey Molokin ◽  
Monica Santin

Abstract Background Blastocystis sp. is one of the most common enteric parasites of humans and animals worldwide. It is well recognized that this ubiquitous protist displays a remarkable degree of genetic diversity in the SSU rRNA gene, which is currently the main gene used for defining Blastocystis subtypes. Yet, full-length reference sequences of this gene are available for only 16 subtypes of Blastocystis in part because of the technical difficulties associated with obtaining these sequences from complex samples. Methods We have developed a method using Oxford Nanopore MinION long-read sequencing and universal eukaryotic primers to produce full-length (> 1800 bp) SSU rRNA gene sequences for Blastocystis. Seven Blastocystis specimens representing five subtypes (ST1, ST4, ST10, ST11, and ST14) obtained both from cultures and feces were used for validation. Results We demonstrate that this method can be used to produce highly accurate full-length sequences from both cultured and fecal DNA isolates. Full-length sequences were successfully obtained from all five subtypes including ST11 for which no full-length reference sequence currently exists and for an isolate that contained mixed ST10/ST14. Conclusions The suitability of the use of MinION long-read sequencing technology to successfully generate full-length Blastocystis SSU rRNA gene sequences was demonstrated. The ability to produce full-length SSU rRNA gene sequences is key in understanding the role of genetic diversity in important aspects of Blastocystis biology such as transmission, host specificity, and pathogenicity.


mBio ◽  
2016 ◽  
Vol 7 (3) ◽  
Author(s):  
Patrick D. Schloss ◽  
Rene A. Girard ◽  
Thomas Martin ◽  
Joshua Edwards ◽  
J. Cameron Thrash

ABSTRACT A census is typically carried out for people across a range of geographical levels; however, microbial ecologists have implemented a molecular census of bacteria and archaea by sequencing their 16S rRNA genes. We assessed how well the census of full-length 16S rRNA gene sequences is proceeding in the context of recent advances in high-throughput sequencing technologies because full-length sequences are typically used as references for classification of the short sequences generated by newer technologies. Among the 1,411,234 and 53,546 full-length bacterial and archaeal sequences, 94.5% and 95.1% of the bacterial and archaeal sequences, respectively, belonged to operational taxonomic units (OTUs) that have been observed more than once. Although these metrics suggest that the census is approaching completion, only 29.2% of the bacterial and 38.5% of the archaeal OTUs have been observed more than once. Thus, there is still considerable diversity to be explored. Unfortunately, the rate of new full-length sequences has been declining, and new sequences are primarily being deposited by a small number of studies. Furthermore, sequences from soil and aquatic environments, which are known to be rich in bacterial diversity, represent only 7.8 and 16.5% of the census, while sequences associated with host-associated environments represent 55.0% of the census. Continued use of traditional approaches and new technologies such as single-cell genomics and short-read assembly are likely to improve our ability to sample rare OTUs if it is possible to overcome this sampling bias. The success of ongoing efforts to use short-read sequencing to characterize archaeal and bacterial communities requires that researchers strive to expand the depth and breadth of this census. IMPORTANCE The biodiversity contained within the bacterial and archaeal domains dwarfs that of the eukaryotes, and the services these organisms provide to the biosphere are critical. 
Surprisingly, we have done a relatively poor job of formally tracking the quality of the biodiversity as represented in full-length 16S rRNA genes. By understanding how this census is proceeding, it is possible to suggest the best allocation of resources for advancing the census. We found that the ongoing effort has done an excellent job of sampling the most abundant organisms but struggles to sample the rarer organisms. Through the use of new sequencing technologies, we should be able to obtain full-length sequences from these rare organisms. Furthermore, we suggest that by allocating more resources to sampling environments known to have the greatest biodiversity, we will be able to make significant advances in our characterization of archaeal and bacterial diversity.
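The two census metrics in this abstract can look contradictory: most sequences belong to OTUs seen more than once, yet a minority of OTUs have been seen more than once. The sketch below, using made-up OTU abundances rather than data from the study, shows how a few abundant OTUs and many singletons produce exactly that pattern. The function name `census_coverage` is hypothetical.

```python
from typing import Sequence, Tuple


def census_coverage(otu_sizes: Sequence[int]) -> Tuple[float, float]:
    """Return (fraction of sequences in OTUs observed more than once,
    fraction of OTUs observed more than once)."""
    if not otu_sizes or any(size < 1 for size in otu_sizes):
        raise ValueError("every OTU must contain at least one sequence")
    repeated = [size for size in otu_sizes if size > 1]
    seq_fraction = sum(repeated) / sum(otu_sizes)
    otu_fraction = len(repeated) / len(otu_sizes)
    return seq_fraction, otu_fraction


if __name__ == "__main__":
    # Toy community: two abundant OTUs and five singletons (illustrative values only).
    sizes = [60, 35, 1, 1, 1, 1, 1]
    seq_fraction, otu_fraction = census_coverage(sizes)
    print(f"sequences in repeated OTUs: {seq_fraction:.1%}")
    print(f"OTUs observed more than once: {otu_fraction:.1%}")
```

In this toy case 95% of sequences fall in repeatedly observed OTUs while fewer than a third of OTUs have been seen more than once, which mirrors the abstract's point that abundant organisms are well sampled and rare ones are not.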


2021 ◽  
Author(s):  
Yuta Kinoshita ◽  
Hidekazu NIWA ◽  
Eri UCHIDA-FUJII ◽  
Toshio NUKADA

Abstract Microbial communities are commonly studied by using amplicon sequencing of part of the 16S rRNA gene. Sequencing of the full-length 16S rRNA gene can provide higher taxonomic resolution and accuracy. To obtain even higher taxonomic resolution, with as few false-positives as possible, we assessed a method using long amplicon sequencing targeting the rRNA operon combined with a CCMetagen pipeline. Taxonomic assignment had >90% accuracy at the species level in a mock sample and at the family level in equine fecal samples, generating similar taxonomic composition as shotgun sequencing. The rRNA operon amplicon sequencing of equine fecal samples underestimated compositional percentages of bacterial strains containing unlinked rRNA genes by a third to almost a half, but unlinked rRNA genes had a limited effect on the overall results. The rRNA operon amplicon sequencing with the A519F + U2428R primer set was able to reflect archaeal genomes, whereas full-length 16S rRNA with 27F + 1492R could not. Therefore, we conclude that amplicon sequencing targeting the rRNA operon captures more detailed variations of bacterial and archaeal microbiota.


2020 ◽  
Author(s):  
Congmin Zhu ◽  
Junyi Zhang ◽  
Xin Wang ◽  
Yuqing Yang ◽  
Ning Chen ◽  
...  

Abstract Background Freshwater lakes are threatened by harmful cyanobacterial blooms, whose basic unit is the cyanobacterial aggregate (CA). Community variations of CA-attached bacteria are substantial during different blooming stages. However, little is known about their transcriptional and metabolic variations. Most bacterial genomes in CAs are absent from existing databases, which limits our understanding of the bacterial variations as responses to cyanobacterial blooms. Results In this longitudinal study, 16 CA samples were collected from Lake Taihu, one of the largest freshwater lakes in China, from April of 2015 to February of 2016. By sequencing the V4 region of 16S rRNA genes, full metagenomes (MG) and metatranscriptomes (MT), we generated 424 Mb of 16S rRNA gene data, 122 Gb of high-quality MG data and 160 Gb of high-quality MT data. We analyzed the taxonomic, functional and transcriptional variations of microbes in CAs along three blooming stages, and constructed metagenome-assembled genomes (MAGs) by binning analysis. First, 55 OTUs, 456 genes and 37 transcripts mainly associated with pathways of transporters, photosystem and energy metabolism showed significantly different abundance among the three stages. Second, 161 high-quality MAGs in CAs were obtained, 19 of which shifted significantly in relative abundance among the three stages. The most abundant MAGs have the gene capacity to synthesize flagella and diverse transporters, and to participate in metabolic pathways of nitrogen, phosphorus and sulfur. Finally, 22 high-quality cyanobacterial MAGs were constructed and can be divided into four functional clusters, which showed significant differences in the energy pathways, transporters and prokaryotic defense system. Conclusion Overall, these results demonstrated the taxonomic, functional and transcriptional variations of microbes in CAs among three different blooming stages. 
Genome construction and metabolic analysis of cyanobacteria and their attached bacteria suggested that the material exchange and signal transmission do, indeed, exist among them. Our understanding of the underlying molecular pathways for cyanobacterial blooms could potentially lead to the control of blooms by interventional strategies to disrupt the expression of critical microbes.


Gut ◽  
2020 ◽  
Vol 69 (10) ◽  
pp. 1796-1806 ◽  
Author(s):  
Lucas Massier ◽  
Rima Chakaroun ◽  
Shirin Tabei ◽  
Alyce Crane ◽  
Konrad David Didt ◽  
...  

Objective Bacterial translocation to various organs including human adipose tissue (AT) due to increased intestinal permeability remains poorly understood. We hypothesised that: (1) bacterial presence is highly tissue specific and (2) related in composition and quantity to immune inflammatory and metabolic burden. Design We quantified and sequenced the bacterial 16S rRNA gene in blood and AT samples (omental, mesenteric and subcutaneous) of 75 subjects with obesity with or without type 2 diabetes (T2D) and used catalysed reporter deposition (CARD) – fluorescence in situ hybridisation (FISH) to detect bacteria in AT. Results Under stringent experimental and bioinformatic control for contaminants, bacterial DNA was detected in blood and omental, subcutaneous and mesenteric AT samples in the range of 0.1 to 5 pg/µg DNA isolate. Moreover, CARD-FISH allowed the detection of living, AT-borne bacteria. Proteobacteria and Firmicutes were the predominant phyla, and bacterial quantity was associated with immune cell infiltration, inflammatory and metabolic parameters in a tissue-specific manner. Bacterial composition differed between subjects with and without T2D and was associated with related clinical measures, including systemic and tissue-specific inflammatory markers. Finally, treatment of adipocytes with bacterial DNA in vitro stimulated the expression of TNFA and IL6. Conclusions Our study provides contaminant-aware evidence for the presence of bacteria and bacterial DNA in several ATs in obesity and T2D and suggests an important role of bacteria in initiating and sustaining local AT subclinical inflammation and therefore impacting metabolic sequelae of obesity.

