Inference based PICRUSt accuracy varies across sample types and functional categories

Mapping Intimacies ◽

10.1101/655746 ◽

2019 ◽

Cited By ~ 3

Author(s):

Shan Sun ◽

Roshonda B. Jones ◽

Anthony A. Fodor

Keyword(s):

16S rRNA ◽

Gene Prediction ◽

Amplicon Sequencing ◽

Taxonomic Composition ◽

Inference Models ◽

16S rRNA Amplicon Sequencing ◽

Gene Profiles ◽

Metagenome Sequencing ◽

The Cost ◽

Functional Profiles

Abstract Background: Despite recent decreases in the cost of sequencing, shotgun metagenome sequencing remains more expensive than 16S rRNA amplicon sequencing. Methods have been developed to predict the functional profiles of microbial communities based on their taxonomic composition, and PICRUSt is the most widely used of these techniques. In this study, we evaluated the performance of PICRUSt by comparing the significance of the differential abundance of functional gene profiles predicted with PICRUSt to those from shotgun metagenome sequencing across different environments. Results: We selected 7 datasets of human, non-human animal and environmental (soil) samples that have publicly available 16S rRNA and shotgun metagenome sequences. As we would expect based on previous literature, strong Spearman correlations were observed between gene compositions predicted with PICRUSt and measured with shotgun metagenome sequencing. However, these strong correlations were preserved even when the sample labels were shuffled. This suggests that a simple correlation coefficient is a highly unreliable measure of the performance of algorithms like PICRUSt. As an alternative, we compared the performance of PICRUSt predicted genes to metagenome genes in inference models associated with metadata within each dataset. With this method, we found reasonable performance for human datasets, with PICRUSt performing better for inference on genes related to “house-keeping” functions. However, the performance of PICRUSt degraded sharply outside of human datasets when used for inference. Conclusion: We conclude that the utility of PICRUSt for inference with the default database is likely limited outside of human samples and that development of tools for gene prediction specific to different non-human and environmental samples is warranted.
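The label-shuffling observation in this abstract can be illustrated with a toy simulation. The sketch below is not the authors' pipeline and uses no real data: the gene counts, noise levels and function names are all assumptions chosen for illustration. It builds "predicted" profiles that share only each gene's typical abundance with the "measured" profiles, and then compares the mean per-sample Spearman correlation for correctly matched samples against randomly re-paired samples.

```python
import random

def ranks(values):
    """Return 1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    return pearson(ranks(x), ranks(y))

def simulate(n_genes=200, n_samples=20, seed=0):
    """Hypothetical data: every profile is a shared gene baseline times noise."""
    rng = random.Random(seed)
    # Baseline abundances span orders of magnitude (assumed, not from the paper).
    baseline = [rng.lognormvariate(0, 2) for _ in range(n_genes)]

    def profile():
        return [b * rng.lognormvariate(0, 0.3) for b in baseline]

    measured = [profile() for _ in range(n_samples)]
    # The "prediction" carries no sample-specific information at all.
    predicted = [profile() for _ in range(n_samples)]
    return measured, predicted

def mean_sample_correlation(measured, predicted, pairing):
    """Average Spearman rho over samples, pairing predicted[i] with measured[pairing[i]]."""
    rhos = [spearman(predicted[i], measured[j]) for i, j in enumerate(pairing)]
    return sum(rhos) / len(rhos)

measured, predicted = simulate()
matched = list(range(len(measured)))
shuffled = matched[:]
random.Random(1).shuffle(shuffled)

rho_matched = mean_sample_correlation(measured, predicted, matched)
rho_shuffled = mean_sample_correlation(measured, predicted, shuffled)
print(f"matched pairs:  mean rho = {rho_matched:.3f}")
print(f"shuffled pairs: mean rho = {rho_shuffled:.3f}")
```

Under these assumptions both values come out high and close to each other, because the correlation is driven by which genes are generally abundant rather than by anything specific to a sample.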


Inference based accuracy of metagenome prediction tools varies across sample types and functional categories

10.21203/rs.2.20233/v1 ◽

2020 ◽

Author(s):

Shan Sun ◽

Roshonda B. Jones ◽

Anthony A. Fodor

Keyword(s):

16S rRNA ◽

Gene Prediction ◽

Amplicon Sequencing ◽

Taxonomic Composition ◽

Inference Models ◽

Prediction Tools ◽

16S rRNA Amplicon Sequencing ◽

Metagenome Sequencing ◽

The Cost ◽

Functional Profiles

Abstract Background: Despite recent decreases in the cost of sequencing, shotgun metagenome sequencing remains more expensive than 16S rRNA amplicon sequencing. Methods have been developed to predict the functional profiles of microbial communities based on their taxonomic composition. In this study, we evaluated the performance of three commonly used metagenome prediction tools (PICRUSt, PICRUSt2 and Tax4Fun) by comparing the significance of the differential abundance of predicted functional gene profiles to those from shotgun metagenome sequencing across different environments. Results: We selected 7 datasets of human, non-human animal and environmental (soil) samples that have publicly available 16S rRNA and shotgun metagenome sequences. As we would expect based on previous literature, strong Spearman correlations were observed between predicted gene compositions and gene relative abundance measured with shotgun metagenome sequencing. However, these strong correlations were preserved even when the abundance of genes was permuted across samples. This suggests that a simple correlation coefficient is a highly unreliable measure of the performance of metagenome prediction tools. As an alternative, we compared the performance of genes predicted with PICRUSt, PICRUSt2 and Tax4Fun to sequenced metagenome genes in inference models associated with metadata within each dataset. With this approach, we found reasonable performance for human datasets, with the metagenome prediction tools performing better for inference on genes related to “house-keeping” functions. However, their performance degraded sharply outside of human datasets when used for inference. Conclusion: We conclude that the utility of PICRUSt, PICRUSt2 and Tax4Fun for inference with the default database is likely limited outside of human samples and that development of tools for gene prediction specific to different non-human and environmental samples is warranted.
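The inference-based comparison described in this abstract can be sketched in a similar toy setting. This is a minimal illustration under stated assumptions, not the authors' method: it uses Welch's t statistic per gene as a stand-in for signed significance, simulated log abundances instead of real datasets, and hypothetical names throughout. For each gene, a case/control test is run on the "measured" data and on two hypothetical predictions, one that carries the true group effect and one that does not; concordance is then the Spearman correlation of the per-gene statistics.

```python
import random
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent groups."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se

def spearman_no_ties(x, y):
    """Spearman's rho via the rank-difference formula (valid when there are no ties)."""
    def rank(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for position, index in enumerate(order):
            r[index] = position + 1
        return r

    rx, ry = rank(x), rank(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

def gene_statistics(effects, n_per_group, rng, carries_signal):
    """Per-gene case-vs-control t statistics on simulated log abundances."""
    stats = []
    for effect in effects:
        shift = effect if carries_signal else 0.0
        cases = [shift + rng.gauss(0, 1) for _ in range(n_per_group)]
        controls = [rng.gauss(0, 1) for _ in range(n_per_group)]
        stats.append(welch_t(cases, controls))
    return stats

rng = random.Random(42)
n_genes, n_per_group = 200, 15
# Assumption: half of the genes truly differ between groups, half do not.
effects = [rng.gauss(0, 1.5) if g < n_genes // 2 else 0.0 for g in range(n_genes)]

t_measured = gene_statistics(effects, n_per_group, rng, carries_signal=True)
t_informative = gene_statistics(effects, n_per_group, rng, carries_signal=True)
t_uninformative = gene_statistics(effects, n_per_group, rng, carries_signal=False)

concordance_informative = spearman_no_ties(t_measured, t_informative)
concordance_uninformative = spearman_no_ties(t_measured, t_uninformative)
print(f"informative prediction:   rho = {concordance_informative:.3f}")
print(f"uninformative prediction: rho = {concordance_uninformative:.3f}")
```

Unlike the per-sample correlation of abundances, this kind of concordance only rises when the prediction reproduces the group differences themselves, which is why it can separate a useful prediction from an uninformative one.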


On the robustness of inference of association with the gut microbiota in stool, swab and mucosal tissue samples

10.1101/2021.02.04.429844 ◽

2021 ◽

Author(s):

Shan Sun ◽

Xiangzhu Zhu ◽

Xiang Huang ◽

Harvey J. Murff ◽

Reid M. Ness ◽

...

Keyword(s):

Microbial Community ◽

Gut Microbiota ◽

Community Composition ◽

Microbial Community Composition ◽

Amplicon Sequencing ◽

Taxonomic Composition ◽

Mucosal Tissue ◽

Tissue Samples ◽

16S rRNA Amplicon Sequencing ◽

Functional Profiles

Abstract The gut microbiota plays an important role in human health and disease. Stool, swab and mucosal tissue samples have been used in individual studies to survey the microbial community but the consequences of using these different sample types are not completely understood. We previously reported differences in microbial community composition with 16S rRNA amplicon sequencing between stool, swab and mucosal tissue samples. Here, we extended the previous study to a larger cohort and performed shotgun metagenome sequencing of 1,397 stool, swab and mucosal tissue samples from 240 participants. Consistent with previous results, taxonomic composition of stool and swab samples was distinct, but still more similar to each other than mucosal tissue samples, which had a substantially different community composition, characterized by a high relative abundance of the mucus metabolizers Bacteroides and Subdoligranulum, as well as bacteria with higher tolerance for oxidative stress such as Escherichia. As has been previously reported, functional profiles were more uniform across sample types than taxonomic profiles with differences between stool and swab samples smaller, but mucosal tissue samples remained distinct from the other two types. When the taxonomic and functional profiles of different sample types were used for inference in association with host phenotypes of age, sex, body mass index (BMI), antibiotics or non-steroidal anti-inflammatory drugs (NSAIDs) use, hypothesis testing using either stool or swab gave broadly similar results, but inference performed on mucosal tissue samples gave results that were generally less consistent with either stool or swab. Our study represents an important resource for the experimental design of studies aimed to understand microbiota perturbations specific to defined micro niches within the human intestinal tract.


Multiomic Approach to Analyze Infant Gut Microbiota: Experimental and Analytical Method Optimization

Biomolecules ◽

10.3390/biom11070999 ◽

2021 ◽

Vol 11 (7) ◽

pp. 999

Author(s):

Helena Torrell ◽

Adrià Cereto-Massagué ◽

Polina Kazakova ◽

Lorena García ◽

Héctor Palacios ◽

...

Keyword(s):

16S rRNA ◽

Gut Microbiota ◽

Variable Region ◽

Amplicon Sequencing ◽

Taxonomic Composition ◽

Intestinal Microbiome ◽

Fecal Samples ◽

16S rRNA Amplicon Sequencing ◽

Method Optimization ◽

Feature Selection Approach

Background: The human intestinal microbiome plays a central role in overall health status, especially in early life stages. 16S rRNA amplicon sequencing is used to profile its taxonomic composition; however, multiomic approaches have been proposed as the most accurate methods for study of the complexity of the gut microbiota. In this study, we propose an optimized method for bacterial diversity analysis that we validated and complemented with metabolomics by analyzing fecal samples. Methods: Forty-eight different analytical combinations regarding (1) 16S rRNA variable region sequencing, (2) a feature selection approach, and (3) taxonomy assignment methods were tested. A total of 18 infant fecal samples grouped depending on the type of feeding were analyzed by the proposed 16S rRNA workflow and by metabolomic analysis. Results: The results showed that the sole use of V4 region sequencing with ASV identification and VSEARCH for taxonomy assignment produced the most accurate results. The application of this workflow showed clear differences between fecal samples according to the type of feeding, which correlated with changes in the fecal metabolic profile. Conclusion: A multiomic approach using real fecal samples from 18 infants with different types of feeding demonstrated the effectiveness of the proposed 16S rRNA-amplicon sequencing workflow.


Comparative analysis of chicken cecal microbial diversity and taxonomic composition in response to dietary variation using 16S rRNA amplicon sequencing

Molecular Biology Reports ◽

10.1007/s11033-021-06712-3 ◽

2021 ◽

Author(s):

Zubia Rashid ◽

Muhammad Zubair Yousaf ◽

Syed Muddassar Hussain Gilani ◽

Sitwat Zehra ◽

Ashaq Ali ◽

...

Keyword(s):

Comparative Analysis ◽

16S rRNA ◽

Microbial Diversity ◽

Amplicon Sequencing ◽

Taxonomic Composition ◽

16S rRNA Amplicon Sequencing ◽

Dietary Variation


Bacterial Diversity of Breast Milk in Healthy Spanish Women: Evolution from Birth to Five Years Postpartum

Nutrients ◽

10.3390/nu13072414 ◽

2021 ◽

Vol 13 (7) ◽

pp. 2414

Author(s):

Laura Sanjulián ◽

Alexandre Lamas ◽

Rocío Barreiro ◽

Alberto Cepeda ◽

Cristina A. Fente ◽

...

Keyword(s):

Breast Milk ◽

16S rRNA ◽

Human Milk ◽

Alpha Diversity ◽

Amplicon Sequencing ◽

Maternal Body Mass Index ◽

16S rRNA Amplicon Sequencing ◽

Spanish Women ◽

Calcium Magnesium ◽

Abundant Genus

The objective of this work was to characterize the microbiota of breast milk in healthy Spanish mothers and to investigate the effects of lactation time on its diversity. A total of ninety-nine human milk samples were collected from healthy Spanish women and were assessed by means of next-generation sequencing of 16S rRNA amplicons and by qPCR. Firmicutes was the most abundant phylum, followed by Bacteroidetes, Actinobacteria, and Proteobacteria. Accordingly, Streptococcus was the most abundant genus. Lactation time showed a strong influence on milk microbiota, positively correlating with Actinobacteria and Bacteroidetes, while Firmicutes was relatively constant over lactation. 16S rRNA amplicon sequencing showed that the highest alpha-diversity was found in samples of prolonged lactation, along with wider differences between individuals. As for milk nutrients, calcium, magnesium, and selenium levels were potentially associated with Streptococcus and Staphylococcus abundance. Additionally, Proteobacteria was positively correlated with docosahexaenoic acid (DHA) levels in breast milk, and Staphylococcus with conjugated linoleic acid. Conversely, Streptococcus and trans-palmitoleic acid showed a negative association. Other factors such as maternal body mass index or diet also showed an influence on the structure of these microbial communities. Overall, human milk in Spanish mothers appeared to be a complex niche shaped by host factors and by its own nutrients, increasing in diversity over time.


Connecting structure to function with the recovery of over 1000 high-quality metagenome-assembled genomes from activated sludge using long-read sequencing

Nature Communications ◽

10.1038/s41467-021-22203-2 ◽

2021 ◽

Vol 12 (1) ◽

Cited By ~ 2

Author(s):

Caitlin M. Singleton ◽

Francesca Petriglieri ◽

Jannie M. Kristensen ◽

Rasmus H. Kirkegaard ◽

Thomas Y. Michaelsen ◽

...

Keyword(s):

16S rRNA ◽

Wastewater Treatment Plants ◽

In Situ Hybridisation ◽

Amplicon Sequencing ◽

rRNA Genes ◽

Fluorescence In Situ Hybridisation ◽

Sequencing Data ◽

High Quality ◽

16S rRNA Amplicon Sequencing ◽

Long Read

Abstract Microorganisms play crucial roles in water recycling, pollution removal and resource recovery in the wastewater industry. The structure of these microbial communities is increasingly understood based on 16S rRNA amplicon sequencing data. However, such data cannot be linked to functional potential in the absence of high-quality metagenome-assembled genomes (MAGs) for nearly all species. Here, we use long-read and short-read sequencing to recover 1083 high-quality MAGs, including 57 closed circular genomes, from 23 Danish full-scale wastewater treatment plants. The MAGs account for ~30% of the community based on relative abundance, and meet the stringent MIMAG high-quality draft requirements including full-length rRNA genes. We use the information provided by these MAGs in combination with >13 years of 16S rRNA amplicon sequencing data, as well as Raman microspectroscopy and fluorescence in situ hybridisation, to uncover abundant undescribed lineages belonging to important functional groups.


Advantage of 16S rRNA amplicon sequencing in Helicobacter pylori diagnosis

Helicobacter ◽

10.1111/hel.12790 ◽

2021 ◽

Author(s):

Boldbaatar Gantuya ◽

Hashem B. El Serag ◽

Batsaikhan Saruuljavkhlan ◽

Dashdorj Azzaya ◽

Takashi Matsumoto ◽

...

Keyword(s):

Helicobacter Pylori ◽

16S rRNA ◽

Amplicon Sequencing ◽

16S rRNA Amplicon Sequencing


Meta-Apo improves accuracy of 16S-amplicon-based prediction of microbiome function

BMC Genomics ◽

10.1186/s12864-020-07307-1 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Gongchao Jing ◽

Yufeng Zhang ◽

Wenzhi Cui ◽

Lu Liu ◽

Jian Xu ◽

...

Keyword(s):

16S rRNA ◽

Large Scale ◽

Low Cost ◽

Human Microbiome ◽

Amplicon Sequencing ◽

Training Sample ◽

rRNA Gene ◽

16S Amplicon Sequencing ◽

Cross Platform ◽

Functional Profiles

Abstract Background: Due to their much lower experimental and computational costs compared with metagenomic whole-genome sequencing (WGS), 16S rRNA gene amplicons have been widely used for predicting the functional profiles of microbiomes, via software tools such as PICRUSt 2. However, due to potential PCR bias and gene profile variation among phylogenetically related genomes, functional profiles predicted from 16S amplicons may deviate from WGS-derived ones, resulting in misleading results. Results: Here we present Meta-Apo, which greatly reduces or even eliminates such deviation, and thus yields much more consistent diversity patterns between the two approaches. Tests of Meta-Apo on > 5000 16S-rRNA amplicon human microbiome samples from 4 body sites showed that the deviation between the two strategies is significantly reduced by using only 15 WGS-amplicon training sample pairs. Moreover, Meta-Apo enables cross-platform functional comparison between WGS and amplicon samples, thus greatly improving 16S-based microbiome diagnosis, e.g. the accuracy of gingivitis diagnosis via 16S-derived functional profiles was elevated from 65 to 95% by WGS-based classification. Therefore, with the low cost of 16S-amplicon sequencing, Meta-Apo can produce a reliable, high-resolution view of microbiome function equivalent to that offered by shotgun WGS. Conclusions: This suggests that large-scale, function-oriented microbiome sequencing projects can probably benefit from the lower cost of the 16S-amplicon strategy, without sacrificing the precision in functional reconstruction that otherwise requires WGS. An optimized C++ implementation of Meta-Apo is available on GitHub (https://github.com/qibebt-bioinfo/meta-apo) under a GNU GPL license. It takes the functional profiles of a few paired WGS:16S-amplicon samples as training, and outputs the calibrated functional profiles for the much larger number of 16S-amplicon samples.


Lactulose significantly increased the relative abundance of Bifidobacterium and Blautia in mice feces as revealed by 16S rRNA amplicon sequencing

Journal of the Science of Food and Agriculture ◽

10.1002/jsfa.11181 ◽

2021 ◽

Author(s):

Shumao Cui ◽

Jiayu Gu ◽

Xuemei Liu ◽

Dongyao Li ◽

Bingyong Mao ◽

...

Keyword(s):

16S rRNA ◽

Relative Abundance ◽

Amplicon Sequencing ◽

16S rRNA Amplicon Sequencing


On the robustness of inference of association with the gut microbiota in stool, rectal swab and mucosal tissue samples

Scientific Reports ◽

10.1038/s41598-021-94205-5 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Shan Sun ◽

Xiangzhu Zhu ◽

Xiang Huang ◽

Harvey J. Murff ◽

Reid M. Ness ◽

...

Keyword(s):

Gut Microbiota ◽

Rectal Swab ◽

Taxonomic Composition ◽

Mucosal Tissue ◽

Tissue Samples ◽

Health And Disease ◽

Inflammatory Drugs ◽

Metagenome Sequencing ◽

Functional Profiles

Abstract The gut microbiota plays an important role in human health and disease. Stool, rectal swab and rectal mucosal tissue samples have been used in individual studies to survey the microbial community but the consequences of using these different sample types are not completely understood. In this study, we report differences in stool, rectal swab and rectal mucosal tissue microbial communities with shotgun metagenome sequencing of 1397 stool, swab and mucosal tissue samples from 240 participants. The taxonomic composition of stool and swab samples was distinct, but they were less different from each other than from mucosal tissue samples. Functional profile differences between stool and swab samples were smaller, but mucosal tissue samples remained distinct from the other two types. When the taxonomic and functional profiles were used for inference in association with host phenotypes of age, sex, body mass index (BMI), antibiotics or non-steroidal anti-inflammatory drugs (NSAIDs) use, hypothesis testing using either stool or rectal swab gave broadly significantly correlated results, but inference performed on mucosal tissue samples gave results that were generally less consistent with either stool or swab. Our study represents an important resource for determination of how inference can change for taxa and pathways depending on the choice of where to sample within the human gut.
