FuSe: a tool to move RNA-Seq analyses from chromosomal/gene loci to functional grouping of mRNA transcripts

Author(s):  
Rajinder Gupta ◽  
Yannick Schrooders ◽  
Marcha Verheijen ◽  
Adrian Roth ◽  
Jos Kleinjans ◽  
...  

Abstract
Summary: Typical RNA sequencing (RNA-Seq) analyses are performed either at the gene level, by summing all reads from the same locus and assuming that all transcripts from a gene make a protein, or at the transcript level, assuming that each transcript has a unique function. However, these assumptions are flawed: a gene can code for different types of transcripts, and different transcripts can give rise to similar proteins, different proteins or no protein at all. As a consequence, functional changes are not well illustrated by either gene or transcript analyses. We propose to improve RNA-Seq analyses by grouping transcripts based on their similar functions. We developed FuSe to predict functional similarities using the primary and secondary structure of proteins. To estimate the likelihood that proteins have similar functions, FuSe computes two confidence scores for protein pairs: knowledge (KS) and discovery (DS). Overlapping protein pairs exhibiting high confidence are grouped to form ‘similar function protein groups’, and expression is calculated for each functional group. The impact of using FuSe is demonstrated on in vitro cells exposed to paracetamol (APAP), where it highlights genes responsible for cell adhesion and glycogen regulation that were earlier shown not to be differentially expressed with traditional analysis methods.
Availability and implementation: The source code is available at https://github.com/rajinder4489/FuSe. Data for APAP exposure are available in the BioStudies database (http://www.ebi.ac.uk/biostudies) under accession numbers S-HECA143, S-HECA(158) and S-HECA139.
Supplementary information: Supplementary data are available at Bioinformatics online.
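The grouping step described in the summary (overlapping high-confidence protein pairs merged into groups, then expression computed per group) can be sketched as a connected-components problem. This is only an illustration and not FuSe's actual implementation: the thresholds `ks_min` and `ds_min`, the rule that both scores must pass, the protein and transcript identifiers, and the choice to sum counts per group are all assumptions made for the example.

```python
from collections import defaultdict


def group_proteins(pair_scores, ks_min=0.8, ds_min=0.8):
    """Merge proteins linked by high-confidence pairs using union-find.

    pair_scores maps (protein_a, protein_b) -> (KS, DS).
    The thresholds are hypothetical, not values taken from FuSe.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), (ks, ds) in pair_scores.items():
        root_a, root_b = find(a), find(b)  # registers both proteins
        if ks >= ks_min and ds >= ds_min:
            parent[root_a] = root_b  # overlapping pairs end up in one group

    groups = defaultdict(set)
    for protein in list(parent):
        groups[find(protein)].add(protein)
    return sorted(sorted(members) for members in groups.values())


def group_expression(groups, transcript_counts, transcript_to_protein):
    """Sum transcript counts over all transcripts whose protein is in a group."""
    protein_to_group = {p: i for i, members in enumerate(groups) for p in members}
    totals = defaultdict(float)
    for transcript, count in transcript_counts.items():
        protein = transcript_to_protein.get(transcript)
        if protein in protein_to_group:
            totals[protein_to_group[protein]] += count
    return dict(totals)


# Toy data: P1-P2 and P2-P3 pass both thresholds, P3-P4 fails on KS.
pairs = {
    ("P1", "P2"): (0.90, 0.85),
    ("P2", "P3"): (0.95, 0.90),
    ("P3", "P4"): (0.40, 0.90),
}
groups = group_proteins(pairs)
counts = {"tx1": 10, "tx2": 5, "tx3": 7, "tx4": 3}
tx_to_protein = {"tx1": "P1", "tx2": "P2", "tx3": "P3", "tx4": "P4"}
expression = group_expression(groups, counts, tx_to_protein)
print(groups)      # [['P1', 'P2', 'P3'], ['P4']]
print(expression)  # {0: 22.0, 1: 3.0}
```

Because P2 appears in two passing pairs, P1, P2 and P3 fall into one group even though P1 and P3 were never scored against each other directly; that transitive merging is the design choice this sketch makes for "overlapping" pairs.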

Author(s):  
Tobias Tekath ◽  
Martin Dugas

Abstract
Motivation: Each year, the number of published bulk and single-cell RNA-seq data sets grows exponentially. Studies analyzing such data commonly look at gene-level differences, while the collected RNA-seq data inherently represent reads of transcript isoform sequences. Using transcriptomic quantifiers, RNA-seq reads can be attributed to specific isoforms, allowing for analysis of transcript-level differences. A differential transcript usage (DTU) analysis tests for proportional differences in a gene’s transcript composition, and has been of rising interest for many research questions, such as analysis of differential splicing or cell type identification.
Results: We present the R package DTUrtle, the first DTU analysis workflow for both bulk and single-cell RNA-seq data sets, and the first package to conduct a ‘classical’ DTU analysis in a single-cell context. DTUrtle extends established statistical frameworks and offers various result aggregation and visualization options as well as a novel detection probability score for tagged-end data. It has been successfully applied to bulk and single-cell RNA-seq data of human and mouse, confirming and extending key results. Additionally, we present novel potential DTU applications, such as the identification of cell type specific transcript isoforms as biomarkers.
Availability: The R package DTUrtle is available at https://github.com/TobiTekath/DTUrtle with extensive vignettes and documentation at https://tobitekath.github.io/DTUrtle/.
Supplementary information: Supplementary data are available at Bioinformatics online.
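To make "proportional differences in a gene's transcript composition" concrete, the following minimal sketch computes each transcript's share of its gene's total count in two conditions and the shift between them. It is not DTUrtle's API and performs no statistical test; the function names, gene and transcript identifiers and counts are invented for illustration.

```python
def usage_proportions(tx_counts, tx_to_gene):
    """Return each transcript's share of its gene's total count."""
    gene_totals = {}
    for tx, count in tx_counts.items():
        gene = tx_to_gene[tx]
        gene_totals[gene] = gene_totals.get(gene, 0.0) + count
    proportions = {}
    for tx, count in tx_counts.items():
        total = gene_totals[tx_to_gene[tx]]
        proportions[tx] = count / total if total > 0 else 0.0
    return proportions


def usage_shift(counts_a, counts_b, tx_to_gene):
    """Difference in transcript usage (condition B minus condition A)."""
    prop_a = usage_proportions(counts_a, tx_to_gene)
    prop_b = usage_proportions(counts_b, tx_to_gene)
    return {tx: prop_b[tx] - prop_a[tx] for tx in prop_a}


# Hypothetical gene G1 with two isoforms. Gene-level counts double from
# condition A to B, and the isoform composition changes from 80/20 to 50/50.
tx_to_gene = {"G1.txA": "G1", "G1.txB": "G1"}
condition_a = {"G1.txA": 80, "G1.txB": 20}
condition_b = {"G1.txA": 100, "G1.txB": 100}

shift = usage_shift(condition_a, condition_b, tx_to_gene)
for tx, delta in shift.items():
    print(tx, round(delta, 3))
```

A gene-level analysis of this toy gene would only report the doubling of total counts; the usage shift is what a DTU analysis targets.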


F1000Research ◽  
2018 ◽  
Vol 7 ◽  
pp. 952 ◽  
Author(s):  
Michael I. Love ◽  
Charlotte Soneson ◽  
Rob Patro

Detection of differential transcript usage (DTU) from RNA-seq data is an important bioinformatic analysis that complements differential gene expression analysis. Here we present a simple workflow using a set of existing R/Bioconductor packages for analysis of DTU. We show how these packages can be used downstream of RNA-seq quantification using the Salmon software package. The entire pipeline is fast, benefiting from inference steps by Salmon to quantify expression at the transcript level. The workflow includes live, runnable code chunks for analysis using DRIMSeq and DEXSeq, as well as for performing two-stage testing of DTU using the stageR package, a statistical framework to screen at the gene level and then confirm which transcripts within the significant genes show evidence of DTU. We evaluate these packages and other related packages on a simulated dataset with parameters estimated from real data.


2020 ◽  
Author(s):  
Silvia Llonch ◽  
Montserrat Barragán ◽  
Paula Nieto ◽  
Anna Mallol ◽  
Marc Elosua-Bayes ◽  
...  

Abstract
Study question: To what degree does maternal age affect the transcriptome of human oocytes at the germinal vesicle (GV) stage or at metaphase II after maturation in vitro (IVM-MII)?
Summary answer: While the oocytes’ transcriptome is predominantly determined by maturation stage, transcript levels of genes related to chromosome segregation, mitochondria and RNA processing are affected by age after in vitro maturation of denuded oocytes.
What is known already: Female fertility is inversely correlated with maternal age due to both a depletion of the oocyte pool and a reduction in oocyte developmental competence. Few studies have addressed the effect of maternal age on the human mature oocyte (MII) transcriptome, which is established during oocyte growth and maturation, and the pathways involved remain unclear. Here, we characterize and compare the transcriptomes of a large cohort of fully grown GV and IVM-MII oocytes from women of varying reproductive age.
Study design, size, duration: In this prospective molecular study, 37 women were recruited from May 2018 to June 2019. The mean age was 28.8 years (SD=7.7, range 18-43). A total of 72 oocytes were included in the study at GV stage after ovarian stimulation, and analyzed as GV (n=40) and in vitro matured oocytes (IVM-MII; n=32).
Participants/materials, setting, methods: Denuded oocytes were included either as GV at the time of ovum pick-up or as IVM-MII after in vitro maturation for 30 hours in G2™ medium, and processed for transcriptomic analysis by single-cell RNA-seq using the Smart-seq2 technology. Cluster and maturation stage marker analyses were performed using the Seurat R package. Genes with an average fold change greater than 2 and a p-value < 0.01 were considered maturation stage markers. A Pearson correlation test was used to identify genes whose expression levels changed progressively with age. Genes with an absolute correlation value |R| ≥ 0.3 and a p-value < 0.05 were considered significant.
Main results and the role of chance: First, by exploration of the RNA-seq data using tSNE dimensionality reduction, we identified two clusters of cells reflecting the oocyte maturation stage (GV and IVM-MII) with 4,445 and 324 putative marker genes, respectively. Next, we identified genes for which RNA levels either progressively increased or decreased with age. This analysis was performed independently for GV and IVM-MII oocytes. Our results indicate that the transcriptome is more affected by age in IVM-MII oocytes (1,219 genes) than in GV oocytes (596 genes). In particular, we found that genes involved in chromosome segregation and RNA splicing significantly increase in transcript levels with age, while genes related to mitochondrial activity present lower transcript levels with age. Gene regulatory network analysis revealed potential upstream master regulator functions for genes whose transcript levels present positive (GPBP1, RLF, SON, TTF1) or negative (BNC1, THRB) correlation with age.
Limitations, reasons for caution: IVM-MII oocytes used in this study were obtained after in vitro maturation of denuded GV oocytes; therefore, their transcriptome might not be fully representative of in vivo matured MII oocytes. The Smart-seq2 methodology used in this study detects polyadenylated transcripts only, and we could therefore not assess non-polyadenylated transcripts.
Wider implications of the findings: Our analysis suggests that advanced maternal age does not globally affect the oocyte transcriptome at GV or IVM-MII stages. Nonetheless, hundreds of genes displayed altered transcript levels with age, particularly in IVM-MII oocytes. Especially affected by age were genes related to chromosome segregation and mitochondrial function, pathways known to be involved in oocyte ageing. Our study thereby suggests that misregulation of chromosome segregation and mitochondrial pathways also at the RNA level might contribute to the age-related quality decline in human oocytes.
Study funding/competing interest(s): This study was funded by the AXA research fund, the European commission, intramural funding of Clinica EUGIN, the Spanish Ministry of Science, Innovation and Universities, the Catalan Agència de Gestió d’Ajuts Universitaris i de Recerca (AGAUR) and by contributions of the Spanish Ministry of Economy, Industry and Competitiveness (MEIC) to the EMBL partnership and to the “Centro de Excelencia Severo Ochoa”. The authors have no conflict of interest to declare.
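The age-correlation filter from the methods (absolute Pearson correlation of at least 0.3 and p-value below 0.05) can be sketched as follows. Note one deliberate substitution: the study reports a Pearson correlation test, whereas this standard-library sketch estimates the p-value by permutation instead of the usual t-distribution, so its p-values will not match those of the study. Gene names, ages and expression values are invented.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined for constant input; treat as 0
    return sxy / math.sqrt(sxx * syy)


def permutation_p(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the correlation (swapped-in method)."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def age_associated(ages, expr_by_gene, r_min=0.3, alpha=0.05):
    """Keep genes with |R| >= r_min and p < alpha, as in the stated criteria."""
    selected = {}
    for gene, expr in expr_by_gene.items():
        r = pearson_r(ages, expr)
        p = permutation_p(ages, expr)
        if abs(r) >= r_min and p < alpha:
            selected[gene] = (r, p)
    return selected


# Invented toy data: one oocyte per donor age.
ages = [22, 25, 28, 31, 34, 37, 40, 43]
expression = {
    "geneA": [1.0, 1.4, 1.9, 2.1, 2.8, 3.0, 3.6, 3.9],  # rises with age
    "geneB": [2.0, 2.1, 1.9, 2.0, 2.1, 1.9, 2.0, 2.1],  # roughly flat
}
selected = age_associated(ages, expression)
print(sorted(selected))
```

Applying both cut-offs matters because, with larger sample sizes, even a weak correlation can reach p < 0.05; the |R| threshold removes such weak associations.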


2021 ◽  
Vol 36 (Supplement_1) ◽  
Author(s):  
Y Liu ◽  
C Jones ◽  
K Coward

Abstract
Study question: What is the mechanism of embryo hatching? Will laser-assisted zona pellucida (ZP) drilling alter the embryonic transcriptome?
Summary answer: Hatching is an ATP-dependent process. Hatching is also associated with Rho-mediated signaling. Laser-assisted ZP drilling might cause alteration in embryo metabolism.
What is known already: Embryo hatching is a vital process for early embryo development and implantation. Animal data suggest that hatching is the result of multiple factors, such as mechanical pressure, protease activation, and the regulation of maternal secretions. However, little is known about the regulatory signaling mechanisms and the molecules involved. In addition, despite the extensive use of laser-assisted ZP drilling in the clinic, data on the safety profile of this technique at the molecular level are very sparse. The impact of this technique on the embryonic transcriptome has not been studied systematically.
Study design, size, duration: Eighty mouse embryos were randomly divided into a laser ZP drilling group (n = 40) and an untreated group (n = 40). After treatment, embryos were cultured in vitro for two days. Then, hatching blastocysts (n = 8) and pre-hatching blastocysts (n = 8) from the untreated group, and hatching blastocysts from the treatment group (n = 8), were processed for RNA sequencing (RNA-seq).
Participants/materials, setting, methods: Cryopreserved 8-cell stage mouse embryos (B6C3F1 × B6D2F1) were thawed, and a laser was used to drill the embryo ZP in the treatment group. Next, the treated and untreated embryos were individually cultured in vitro to the E4.5 blastocyst stage. The resulting blastocysts were lysed individually and used for subsequent cDNA library preparation and RNA-seq. Following data quality control and alignment, the RNA-seq data were processed for differentially expressed gene analysis and downstream functional analysis.
Main results and the role of chance: According to the RNA-seq data, 275 differentially expressed genes (DEGs) (230 up-regulated and 45 down-regulated, adjusted P < 0.05) were identified when comparing hatching and pre-hatching blastocysts in the control groups. Analysis suggested that the trophectoderm is the primary cell type involved in hatching, and revealed the potential molecules causing increased blastocyst hydrostatic pressure (Aqp3 and Cldn4). Functional enrichment analysis suggested that ATP metabolism and protein synthesis were activated in hatching blastocysts. DEGs were found to be significantly enriched in several gene ontology terms, particularly in terms of the organization of the cytoskeleton and actin polymerisation (P < 0.0001). Furthermore, according to QIAGEN ingenuity pathway analysis results, Rho signaling was implicated in blastocyst hatching (Actb, Arpc2, Cfl1, Myl6, Pfn1, Rnd3, Septin9, z-score=2.65, P < 0.0001). Moreover, potential roles of hormones (estrogen (z-score=2.24) and prolactin (z-score=2.4)) and growth factors (AGT (z-score=2.41) and FGF2 (z-score=2.213)) in the hatching process were indicated by the upstream regulator analysis. By comparing the transcriptome between laser-treated and untreated hatching blastocysts, 47 DEGs were identified (adjusted P < 0.05) following laser-assisted ZP drilling. These genes were enriched in metabolism-related pathways (P < 0.05), including the lipid metabolism pathway (Mvd, Mvk, Aacs, Gsk3a, Pik3c2a, Aldh9a1) and the xenobiotic metabolism pathway (Aldh18a1, Aldh9a1, Keap1, and Pik3c2a).
Limitations, reasons for caution: Findings in mouse embryos may not be fully representative of human embryos. Furthermore, the mechanism of hatching revealed here might only reflect the hatching process of embryos in vitro. Further studies are now necessary to confirm these findings in different conditions and species to determine their clinical significance.
Wider implications of the findings: Our study profiled the mouse embryo transcriptome during in vitro hatching and identified potential key genes and mechanisms for future study. In addition, for the first time, we revealed the impact of laser-assisted ZP drilling on the transcriptome; this may help us to assess and improve the existing technique.
Trial registration number: Not applicable


2020 ◽  
Vol 2 (3) ◽  
Author(s):  
Yang Liao ◽  
Wei Shi

Abstract
RNA sequencing (RNA-seq) is currently the standard method for genome-wide expression profiling. RNA-seq reads often need to be mapped to a reference genome before read counts can be produced for genes. Read trimming methods have been developed to assist read mapping by removing adapter sequences and low-sequencing-quality bases. It is, however, unclear what impact read trimming has on the quantification of RNA-seq data, an important task in RNA-seq data analysis. In this study, we used a benchmark RNA-seq dataset and simulation data to assess the impact of read trimming on mapping and quantification of RNA-seq reads. We found that adapter sequences can be effectively removed by the read aligner via ‘soft-clipping’ and that many low-sequencing-quality bases, which would be removed by read trimming tools, were rescued by the aligner. Accuracy of gene expression quantification using untrimmed reads was found to be comparable to or slightly better than that using trimmed reads, based on Pearson correlation with reverse transcriptase-polymerase chain reaction data and simulation truth. Total data analysis time was reduced by up to an order of magnitude when read trimming was not performed. Our study suggests that read trimming is a redundant process in the quantification of RNA-seq expression data.
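Soft-clipping is recorded in the CIGAR string of each alignment in SAM/BAM output, where an `S` operation at either end marks read bases that were kept in the record but excluded from the alignment. The sketch below, which is not taken from the study, counts soft-clipped bases at each end of a read; the function name and example CIGAR strings are made up, and the unavailable-CIGAR value `*` is deliberately rejected.

```python
import re

# CIGAR operations defined by the SAM specification.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clipped_bases(cigar):
    """Return (left, right) counts of soft-clipped bases for one CIGAR string."""
    ops = [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]
    # Rebuilding the string catches stray characters the regex skipped over.
    if not ops or "".join(f"{length}{op}" for length, op in ops) != cigar:
        raise ValueError(f"not a valid CIGAR string: {cigar!r}")
    # Hard clips (H) may only sit outside soft clips, so drop them first.
    core = [item for item in ops if item[1] != "H"]
    left = core[0][0] if core and core[0][1] == "S" else 0
    right = core[-1][0] if len(core) > 1 and core[-1][1] == "S" else 0
    return left, right


# A 100-base read whose last 25 bases (e.g. adapter) did not align.
print(soft_clipped_bases("75M25S"))  # (0, 25)
print(soft_clipped_bases("100M"))    # (0, 0)
```

Summing these counts over all alignments gives a rough picture of how many bases the aligner set aside without any prior trimming step.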


2018 ◽  
Vol 9 (1) ◽  
pp. 21-34 ◽  
Author(s):  
K. Adamberg ◽  
K. Kolk ◽  
M. Jaagura ◽  
R. Vilu ◽  
S. Adamberg

The metabolic activity of colon microbiota is specifically affected by fibres with various monomer compositions, degree of polymerisation and branching. The supply of a variety of dietary fibres assures the diversity of gut microbial communities considered important for the well-being of the host. The aim of this study was to compare the impact of different oligo- and polysaccharides (galacto- and fructooligosaccharides, resistant starch, levan, inulin, arabinogalactan, xylan, pectin and chitin), and a glycoprotein mucin on the growth and metabolism of faecal microbiota in vitro by using isothermal microcalorimetry (IMC). Faecal samples from healthy donors were incubated in a phosphate-buffered defined medium with or without supplementation of a single substrate. The generation of heat was followed on-line, microbiota composition (V3-V4 region of the 16S rRNA using Illumina MiSeq v2) and concentrations of metabolites (HPLC) were determined at the end of growth. The multiauxic power-time curves obtained were substrate-specific. More than 70% of all substrates except chitin were fermented by faecal microbiota with total heat generation of up to 8 J/ml. The final metabolite patterns were in accordance with the microbiota changes. For arabinogalactan, xylan and levan, the fibre-affected distribution of bacterial taxa showed clear similarities (e.g. increase of Bacteroides ovatus and decrease of Bifidobacterium adolescentis). The formation of propionic acid, an important colon metabolite, was enhanced by arabinogalactan, xylan and mucin but not by galacto- and fructooligosaccharides or inulin. Mucin fermentation resulted in acetate, propionate and butyrate production in ratios previously observed for faecal samples, indicating that mucins may serve as major substrates for colon microbial population. IMC combined with analytical methods was shown to be an effective method for screening the impact of specific dietary fibres on functional changes in faecal microbiota.


2017 ◽  
Author(s):  
Koen Van den Berge ◽  
Charlotte Soneson ◽  
Mark D. Robinson ◽  
Lieven Clement

Abstract
Background: Reductions in sequencing cost and innovations in expression quantification have prompted an emergence of RNA-seq studies with complex designs and data analysis at transcript resolution. These applications involve multiple hypotheses per gene, leading to challenging multiple testing problems. Conventional approaches provide separate top-lists for every contrast and false discovery rate (FDR) control at individual hypothesis level. Hence, they fail to establish proper gene-level error control, which compromises downstream validation experiments. Tests that aggregate individual hypotheses are more powerful and provide gene-level FDR control, but in the RNA-seq literature no methods are available for post-hoc analysis of individual hypotheses.
Results: We introduce a two-stage procedure that leverages the increased power of aggregated hypothesis tests while maintaining high biological resolution by post-hoc analysis of genes passing the screening hypothesis. Our method is evaluated on simulated and real RNA-seq experiments. It provides gene-level FDR control in studies with complex designs while boosting power for interaction effects without compromising the discovery of main effects. In a differential transcript usage/expression context, stage-wise testing gains power by aggregating hypotheses at the gene level, while providing transcript-level assessment of genes passing the screening stage. Finally, a prostate cancer case study highlights the relevance of combining gene with transcript level results.
Conclusion: Stage-wise testing is a general paradigm that can be adopted whenever individual hypotheses can be aggregated. In our context, it achieves an optimal middle ground between biological resolution and statistical power while providing gene-level FDR control, which is beneficial for downstream biological interpretation and validation.
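The two-stage idea (screen genes with an aggregated test, then examine individual transcripts only within genes that pass) can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: it assumes Benjamini-Hochberg for the screening stage, a plain Bonferroni correction within each gene for the confirmation stage, and a confirmation level scaled by the fraction of genes that passed screening. All p-values and identifiers are invented.

```python
def bh_reject(pvalues, alpha):
    """Keys rejected by the Benjamini-Hochberg step-up procedure at level alpha."""
    ranked = sorted(pvalues.items(), key=lambda item: item[1])
    m = len(ranked)
    cutoff = 0
    for rank, (_, p) in enumerate(ranked, start=1):
        if p <= alpha * rank / m:
            cutoff = rank  # largest rank meeting the BH condition
    return {key for key, _ in ranked[:cutoff]}


def stagewise(gene_p, tx_p_by_gene, alpha=0.05):
    """Simplified two-stage test: gene-level screen, then per-gene confirmation."""
    screened = bh_reject(gene_p, alpha)
    if not screened:
        return {}
    # Assumed adjustment: shrink the level by the fraction of genes passing.
    alpha_confirm = alpha * len(screened) / len(gene_p)
    confirmed = {}
    for gene in sorted(screened):
        tx_p = tx_p_by_gene.get(gene, {})
        m = len(tx_p)
        # Bonferroni within the gene (a stand-in for more refined corrections).
        confirmed[gene] = sorted(
            tx for tx, p in tx_p.items() if p * m <= alpha_confirm
        )
    return confirmed


gene_p = {"G1": 0.001, "G2": 0.02, "G3": 0.6, "G4": 0.9}
tx_p_by_gene = {
    "G1": {"G1.a": 0.0005, "G1.b": 0.3},
    "G2": {"G2.a": 0.02, "G2.b": 0.04},
    "G3": {"G3.a": 0.0001},  # small p-value, but the gene fails screening
}
result = stagewise(gene_p, tx_p_by_gene)
print(result)  # {'G1': ['G1.a'], 'G2': []}
```

In this toy run G2 passes the screen without any single transcript being confirmed, and G3.a is never examined because its gene did not pass, which is how the gene-level error control constrains transcript-level findings.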


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Anna-Klara Amler ◽  
Domenic Schlauch ◽  
Selin Tüzüner ◽  
Alexander Thomas ◽  
Norbert Neckel ◽  
...  

Abstract
Radiotherapy of head and neck squamous cell carcinoma can lead to long-term complications like osteoradionecrosis, resulting in severe impairment of the jawbone. Current standard procedures require a 6-month wait after irradiation before dental reconstruction can begin. A comprehensive characterization of the irradiation-induced molecular and functional changes in bone cells could allow the development of novel strategies for an earlier successful dental reconstruction in patients treated by radiotherapy. The impact of ionizing radiation on the bone-forming alveolar osteoblasts remains, however, elusive, as previous studies have relied on animal-based models and fetal or animal-derived cell lines. This study presents the first in vitro data obtained from primary human alveolar osteoblasts. Primary human alveolar osteoblasts were isolated from healthy donors and expanded. After X-ray irradiation with 2, 6 and 10 Gy, cells were cultivated under osteogenic conditions and analyzed regarding their proliferation, mineralization, and expression of marker genes and proteins. Proliferation of osteoblasts decreased in a dose-dependent manner. While cells recovered from irradiation with 2 Gy, application of 6 and 10 Gy doses not only led to a permanent impairment of proliferation, but also resulted in altered cell morphology and a disturbed structure of the extracellular matrix, as demonstrated by immunostaining of collagen I and fibronectin. Following irradiation with any of the examined doses, a decrease of marker gene expression levels was observed for most of the investigated genes, revealing interindividual differences. Primary human alveolar osteoblasts presented a considerably changed phenotype after irradiation, depending on the dose administered. The mechanisms behind these findings need to be further investigated. This could facilitate improved patient care by re-evaluating current standard procedures and investigating faster and safer reconstruction concepts, thus improving quality of life and social integrity.


Author(s):  
Davide Risso ◽  
Stefano Maria Pagnotta

Abstract
Motivation: Data transformations are an important step in the analysis of RNA-seq data. Nonetheless, the impact of transformation on the outcome of unsupervised clustering procedures is still unclear.
Results: Here, we present an Asymmetric Winsorization per Sample Transformation (AWST), which is robust to data perturbations and removes the need for selecting the most informative genes prior to sample clustering. Our procedure leads to robust and biologically meaningful clusters both in bulk and in single-cell applications.
Availability: The AWST method is available at https://github.com/drisso/awst. The code to reproduce the analyses is available at https://github.com/drisso/awst_analysis.
Supplementary information: Supplementary data are available at Bioinformatics online.

