Inference based PICRUSt accuracy varies across sample types and functional categories

2019 ◽  
Author(s):  
Shan Sun ◽  
Roshonda B. Jones ◽  
Anthony A. Fodor

Abstract Background: Despite recent decreases in the cost of sequencing, shotgun metagenome sequencing remains more expensive than 16S rRNA amplicon sequencing. Methods have been developed to predict the functional profiles of microbial communities based on their taxonomic composition, and PICRUSt is the most widely used of these techniques. In this study, we evaluated the performance of PICRUSt by comparing the significance of the differential abundance of functional gene profiles predicted with PICRUSt to those from shotgun metagenome sequencing across different environments. Results: We selected 7 datasets of human, non-human animal and environmental (soil) samples that have publicly available 16S rRNA and shotgun metagenome sequences. As we would expect based on previous literature, strong Spearman correlations were observed between gene compositions predicted with PICRUSt and measured with shotgun metagenome sequencing. However, these strong correlations were preserved even when the sample labels were shuffled. This suggests that a simple correlation coefficient is a highly unreliable measure of the performance of algorithms like PICRUSt. As an alternative, we compared the performance of PICRUSt predicted genes to metagenome genes in inference models associated with metadata within each dataset. With this method, we found reasonable performance for human datasets, with PICRUSt performing better for inference on genes related to “house-keeping” functions. However, the performance of PICRUSt degraded sharply outside of human datasets when used for inference. Conclusion: We conclude that the utility of PICRUSt for inference with the default database is likely limited outside of human samples and that development of tools for gene prediction specific to different non-human and environmental samples is warranted.
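The shuffled-label observation in this abstract can be illustrated with a small simulation. The sketch below is not the authors' pipeline and uses no real PICRUSt output. It assumes synthetic data in which gene abundances span several orders of magnitude, and the function names (`simulate`, `compare`) and all parameters are hypothetical. It compares the mean per-sample Spearman correlation for correctly matched predicted and measured profiles against the same statistic after the sample pairing is shuffled.

```python
import random

def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2 + 1
        for k in range(start, end + 1):
            out[order[k]] = average
        start = end + 1
    return out

def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def simulate(n_samples=20, n_genes=200, seed=0):
    """Synthetic data (assumption): each gene has a baseline abundance that
    dominates the per-sample variation, as for common versus rare genes."""
    rng = random.Random(seed)
    baseline = [10 ** rng.uniform(0, 4) for _ in range(n_genes)]
    measured = [[b * rng.lognormvariate(0, 0.3) for b in baseline]
                for _ in range(n_samples)]
    # The "prediction" is the measured profile plus multiplicative noise.
    predicted = [[m * rng.lognormvariate(0, 0.3) for m in row] for row in measured]
    return measured, predicted

def mean_correlation(measured, predicted, pairing):
    """Average Spearman correlation of predicted[i] against measured[pairing[i]]."""
    scores = [spearman(predicted[i], measured[j]) for i, j in enumerate(pairing)]
    return sum(scores) / len(scores)

def compare(measured, predicted, seed=1):
    """Return (matched, shuffled) mean correlations."""
    matched = list(range(len(measured)))
    shuffled = matched[:]
    random.Random(seed).shuffle(shuffled)  # break the sample pairing
    return (mean_correlation(measured, predicted, matched),
            mean_correlation(measured, predicted, shuffled))

if __name__ == "__main__":
    measured, predicted = simulate()
    matched, shuffled = compare(measured, predicted)
    print(f"matched: {matched:.3f}  shuffled: {shuffled:.3f}")
```

In this toy setting both values stay high because the ranking of genes within a sample is driven mostly by gene-level baselines, not by sample-specific signal. That is one plausible reading of why a plain correlation coefficient says little about prediction quality.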

2020 ◽  
Author(s):  
Shan Sun ◽  
Roshonda B. Jones ◽  
Anthony A. Fodor

Abstract Background: Despite recent decreases in the cost of sequencing, shotgun metagenome sequencing remains more expensive than 16S rRNA amplicon sequencing. Methods have been developed to predict the functional profiles of microbial communities based on their taxonomic composition. In this study, we evaluated the performance of three commonly used metagenome prediction tools (PICRUSt, PICRUSt2 and Tax4Fun) by comparing the significance of the differential abundance of predicted functional gene profiles to those from shotgun metagenome sequencing across different environments. Results: We selected 7 datasets of human, non-human animal and environmental (soil) samples that have publicly available 16S rRNA and shotgun metagenome sequences. As we would expect based on previous literature, strong Spearman correlations were observed between predicted gene compositions and gene relative abundance measured with shotgun metagenome sequencing. However, these strong correlations were preserved even when the abundances of genes were permuted across samples. This suggests that a simple correlation coefficient is a highly unreliable measure of the performance of metagenome prediction tools. As an alternative, we compared the performance of genes predicted with PICRUSt, PICRUSt2 and Tax4Fun to sequenced metagenome genes in inference models associated with metadata within each dataset. With this approach, we found reasonable performance for human datasets, with the metagenome prediction tools performing better for inference on genes related to “house-keeping” functions. However, their performance degraded sharply outside of human datasets when used for inference. Conclusion: We conclude that the utility of PICRUSt, PICRUSt2 and Tax4Fun for inference with the default database is likely limited outside of human samples and that development of tools for gene prediction specific to different non-human and environmental samples is warranted.
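The inference-based comparison described in this abstract can be sketched roughly as follows. This is an illustrative assumption, not the authors' published method. For each gene it computes a signed two-group Welch t statistic on the measured table and on the predicted table, then takes the Spearman correlation of the two sets of statistics across genes. The abstract does not specify the statistical models, so the choice of Welch's t, the two-group metadata, the synthetic data, and the names `gene_statistics`, `inference_agreement` and `toy_data` are all hypothetical.

```python
import random
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic; the sign gives the direction of the group shift."""
    return (mean(a) - mean(b)) / (variance(a) / len(a) + variance(b) / len(b)) ** 0.5

def rank(values):
    """Simple 1-based ranks; assumes no ties (true for continuous statistics)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0] * len(values)
    for r, i in enumerate(order, start=1):
        out[i] = r
    return out

def spearman(x, y):
    """Spearman correlation via the rank-difference formula (no ties)."""
    rx, ry = rank(x), rank(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

def gene_statistics(table, groups):
    """table[sample][gene] -> one Welch t per gene, group 1 versus group 0."""
    stats = []
    for g in range(len(table[0])):
        case = [row[g] for row, lab in zip(table, groups) if lab == 1]
        ctrl = [row[g] for row, lab in zip(table, groups) if lab == 0]
        stats.append(welch_t(case, ctrl))
    return stats

def inference_agreement(measured, predicted, groups):
    """Correlate per-gene test statistics from the two data sources."""
    return spearman(gene_statistics(measured, groups),
                    gene_statistics(predicted, groups))

def toy_data(n_per_group=20, n_genes=100, seed=0):
    """Synthetic example (assumption): one informative and one uninformative predictor."""
    rng = random.Random(seed)
    groups = [0] * n_per_group + [1] * n_per_group
    effect = [rng.gauss(0, 1) for _ in range(n_genes)]
    measured = [[rng.gauss(0, 1) + e * lab for e in effect] for lab in groups]
    good = [[m + rng.gauss(0, 0.5) for m in row] for row in measured]
    bad = [[rng.gauss(0, 1) for _ in range(n_genes)] for _ in groups]
    return measured, good, bad, groups

if __name__ == "__main__":
    measured, good, bad, groups = toy_data()
    print("informative predictor:", round(inference_agreement(measured, good, groups), 3))
    print("uninformative predictor:", round(inference_agreement(measured, bad, groups), 3))
```

Unlike a per-sample correlation of abundances, this agreement score is high only when the predicted table reproduces which genes differ between groups and in which direction. An uninformative predictor therefore scores near zero in the toy data.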


2021 ◽  
Author(s):  
Shan Sun ◽  
Xiangzhu Zhu ◽  
Xiang Huang ◽  
Harvey J. Murff ◽  
Reid M. Ness ◽  
...  

Abstract The gut microbiota plays an important role in human health and disease. Stool, swab and mucosal tissue samples have been used in individual studies to survey the microbial community but the consequences of using these different sample types are not completely understood. We previously reported differences in microbial community composition with 16S rRNA amplicon sequencing between stool, swab and mucosal tissue samples. Here, we extended the previous study to a larger cohort and performed shotgun metagenome sequencing of 1,397 stool, swab and mucosal tissue samples from 240 participants. Consistent with previous results, taxonomic composition of stool and swab samples was distinct, but still more similar to each other than to mucosal tissue samples, which had a substantially different community composition, characterized by a high relative abundance of the mucus metabolizers Bacteroides and Subdoligranulum, as well as bacteria with higher tolerance for oxidative stress such as Escherichia. As has been previously reported, functional profiles were more uniform across sample types than taxonomic profiles, with smaller differences between stool and swab samples, but mucosal tissue samples remained distinct from the other two types. When the taxonomic and functional profiles of different sample types were used for inference in association with host phenotypes of age, sex, body mass index (BMI), antibiotics or non-steroidal anti-inflammatory drugs (NSAIDs) use, hypothesis testing using either stool or swab gave broadly similar results, but inference performed on mucosal tissue samples gave results that were generally less consistent with either stool or swab. Our study represents an important resource for the experimental design of studies that aim to understand microbiota perturbations specific to defined micro niches within the human intestinal tract.


Biomolecules ◽  
2021 ◽  
Vol 11 (7) ◽  
pp. 999
Author(s):  
Helena Torrell ◽  
Adrià Cereto-Massagué ◽  
Polina Kazakova ◽  
Lorena García ◽  
Héctor Palacios ◽  
...  

Background: The human intestinal microbiome plays a central role in overall health status, especially in early life stages. 16S rRNA amplicon sequencing is used to profile its taxonomic composition; however, multiomic approaches have been proposed as the most accurate methods for study of the complexity of the gut microbiota. In this study, we propose an optimized method for bacterial diversity analysis that we validated and complemented with metabolomics by analyzing fecal samples. Methods: Forty-eight different analytical combinations regarding (1) 16S rRNA variable region sequencing, (2) a feature selection approach, and (3) taxonomy assignment methods were tested. A total of 18 infant fecal samples grouped depending on the type of feeding were analyzed by the proposed 16S rRNA workflow and by metabolomic analysis. Results: The results showed that the sole use of V4 region sequencing with ASV identification and VSEARCH for taxonomy assignment produced the most accurate results. The application of this workflow showed clear differences between fecal samples according to the type of feeding, which correlated with changes in the fecal metabolic profile. Conclusion: A multiomic approach using real fecal samples from 18 infants with different types of feeding demonstrated the effectiveness of the proposed 16S rRNA-amplicon sequencing workflow.


Nutrients ◽  
2021 ◽  
Vol 13 (7) ◽  
pp. 2414
Author(s):  
Laura Sanjulián ◽  
Alexandre Lamas ◽  
Rocío Barreiro ◽  
Alberto Cepeda ◽  
Cristina A. Fente ◽  
...  

The objective of this work was to characterize the microbiota of breast milk in healthy Spanish mothers and to investigate the effects of lactation time on its diversity. A total of ninety-nine human milk samples were collected from healthy Spanish women and were assessed by means of next-generation sequencing of 16S rRNA amplicons and by qPCR. Firmicutes was the most abundant phylum, followed by Bacteroidetes, Actinobacteria, and Proteobacteria. Accordingly, Streptococcus was the most abundant genus. Lactation time showed a strong influence on the milk microbiota, positively correlating with Actinobacteria and Bacteroidetes, while Firmicutes was relatively constant over lactation. 16S rRNA amplicon sequencing showed that the highest alpha-diversity was found in samples of prolonged lactation, along with wider differences between individuals. As for milk nutrients, calcium, magnesium, and selenium levels were potentially associated with Streptococcus and Staphylococcus abundance. Additionally, Proteobacteria was positively correlated with docosahexaenoic acid (DHA) levels in breast milk, and Staphylococcus with conjugated linoleic acid. Conversely, Streptococcus and trans-palmitoleic acid showed a negative association. Other factors such as maternal body mass index or diet also showed an influence on the structure of these microbial communities. Overall, human milk in Spanish mothers appeared to be a complex niche shaped by host factors and by its own nutrients, increasing in diversity over time.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Caitlin M. Singleton ◽  
Francesca Petriglieri ◽  
Jannie M. Kristensen ◽  
Rasmus H. Kirkegaard ◽  
Thomas Y. Michaelsen ◽  
...  

Abstract Microorganisms play crucial roles in water recycling, pollution removal and resource recovery in the wastewater industry. The structure of these microbial communities is increasingly understood based on 16S rRNA amplicon sequencing data. However, such data cannot be linked to functional potential in the absence of high-quality metagenome-assembled genomes (MAGs) for nearly all species. Here, we use long-read and short-read sequencing to recover 1083 high-quality MAGs, including 57 closed circular genomes, from 23 Danish full-scale wastewater treatment plants. The MAGs account for ~30% of the community based on relative abundance, and meet the stringent MIMAG high-quality draft requirements including full-length rRNA genes. We use the information provided by these MAGs in combination with >13 years of 16S rRNA amplicon sequencing data, as well as Raman microspectroscopy and fluorescence in situ hybridisation, to uncover abundant undescribed lineages belonging to important functional groups.


Helicobacter ◽  
2021 ◽  
Author(s):  
Boldbaatar Gantuya ◽  
Hashem B. El Serag ◽  
Batsaikhan Saruuljavkhlan ◽  
Dashdorj Azzaya ◽  
Takashi Matsumoto ◽  
...  

BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Gongchao Jing ◽  
Yufeng Zhang ◽  
Wenzhi Cui ◽  
Lu Liu ◽  
Jian Xu ◽  
...  

Abstract Background: Due to their much lower costs in experiment and computation than metagenomic whole-genome sequencing (WGS), 16S rRNA gene amplicons have been widely used for predicting the functional profiles of microbiomes, via software tools such as PICRUSt 2. However, due to the potential PCR bias and gene profile variation among phylogenetically related genomes, functional profiles predicted from 16S amplicons may deviate from WGS-derived ones, resulting in misleading results. Results: Here we present Meta-Apo, which greatly reduces or even eliminates such deviation, thus deducing much more consistent diversity patterns between the two approaches. Tests of Meta-Apo on > 5000 16S-rRNA amplicon human microbiome samples from 4 body sites showed that the deviation between the two strategies is significantly reduced by using only 15 WGS-amplicon training sample pairs. Moreover, Meta-Apo enables cross-platform functional comparison between WGS and amplicon samples, thus greatly improving 16S-based microbiome diagnosis, e.g. accuracy of gingivitis diagnosis via 16S-derived functional profiles was elevated from 65 to 95% by WGS-based classification. Therefore, with the low cost of 16S-amplicon sequencing, Meta-Apo can produce a reliable, high-resolution view of microbiome function equivalent to that offered by shotgun WGS. Conclusions: This suggests that large-scale, function-oriented microbiome sequencing projects can probably benefit from the lower cost of the 16S-amplicon strategy, without sacrificing the precision in functional reconstruction that otherwise requires WGS. An optimized C++ implementation of Meta-Apo is available on GitHub (https://github.com/qibebt-bioinfo/meta-apo) under a GNU GPL license. It takes the functional profiles of a few paired WGS:16S-amplicon samples as training, and outputs the calibrated functional profiles for the much larger number of 16S-amplicon samples.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Shan Sun ◽  
Xiangzhu Zhu ◽  
Xiang Huang ◽  
Harvey J. Murff ◽  
Reid M. Ness ◽  
...  

Abstract The gut microbiota plays an important role in human health and disease. Stool, rectal swab and rectal mucosal tissue samples have been used in individual studies to survey the microbial community but the consequences of using these different sample types are not completely understood. In this study, we report differences in stool, rectal swab and rectal mucosal tissue microbial communities with shotgun metagenome sequencing of 1397 stool, swab and mucosal tissue samples from 240 participants. The taxonomic composition of stool and swab samples was distinct, but the two differed less from each other than from mucosal tissue samples. Functional profile differences between stool and swab samples were smaller, but mucosal tissue samples remained distinct from the other two types. When the taxonomic and functional profiles were used for inference in association with host phenotypes of age, sex, body mass index (BMI), antibiotics or non-steroidal anti-inflammatory drugs (NSAIDs) use, hypothesis testing using either stool or rectal swab gave broadly significantly correlated results, but inference performed on mucosal tissue samples gave results that were generally less consistent with either stool or swab. Our study represents an important resource for determination of how inference can change for taxa and pathways depending on the choice of where to sample within the human gut.

