Poly(A)-ClickSeq: click-chemistry for next-generation 3´-end sequencing without RNA enrichment or fragmentation

2017 ◽  
Author(s):  
Andrew Routh ◽  
Ping Ji ◽  
Elizabeth Jaworski ◽  
Zheng Xia ◽  
Wei Li ◽  
...  

Abstract. The recent emergence of alternative polyadenylation (APA) as an engine driving transcriptomic diversity has stimulated the development of sequencing methodologies designed to assess genome-wide polyadenylation events. The goal of these approaches is to enrich, partition, capture, and ultimately sequence poly(A) site junctions. However, these methods often require poly(A) enrichment, 3´ linker ligation steps, and RNA fragmentation, which can necessitate higher levels of starting RNA, increase experimental error, and potentially introduce bias. We recently reported a click-chemistry-based method for generating RNA-seq libraries called “ClickSeq”. Here, we adapt this method to direct cDNA synthesis specifically toward the 3´ UTR/poly(A) tail junction of cellular RNA. With this novel approach, we demonstrate sensitive and specific enrichment for poly(A) site junctions without the need for complex sample preparation, fragmentation or purification. Poly(A)-ClickSeq (PAC-seq) is therefore a simple procedure that generates high-quality RNA-seq poly(A) libraries. As a proof of principle, we used PAC-seq to explore the poly(A) landscape of both human and Drosophila cells in culture, observed outstanding overlap with existing poly(A) databases, and identified previously unannotated poly(A) sites. Moreover, we used PAC-seq to quantify and analyze APA events regulated by CFIm25, illustrating how this technology can be harnessed to identify alternatively polyadenylated RNA.

Author(s):  
Wenbin Ye ◽  
Tao Liu ◽  
Hongjuan Fu ◽  
Congting Ye ◽  
Guoli Ji ◽  
...  

Abstract. Motivation: Alternative polyadenylation (APA) is widely recognized as a widespread, dynamically modulated mechanism. Studies based on 3′ end sequencing and/or RNA-seq have profiled poly(A) sites in various species with diverse pipelines, yet no unified and easy-to-use toolkit is available for comprehensive APA analyses. Results: We developed an R package called movAPA for modeling and visualization of dynamics of alternative polyadenylation across biological samples. movAPA incorporates rich functions for preprocessing, annotation and statistical analyses of poly(A) sites, identification of poly(A) signals, profiling of APA dynamics and visualization. In particular, seven metrics are provided for measuring the tissue specificity or usage of APA sites across samples. Three methods are used for identifying 3′ UTR shortening/lengthening events between conditions. APA site switching involving non-3′ UTR polyadenylation can also be explored. Using poly(A) site data from rice and mouse sperm cells, we demonstrated the high scalability and flexibility of movAPA in profiling APA dynamics across tissues and single cells. Availability and implementation: https://github.com/BMILAB/movAPA. Supplementary information: Supplementary data are available at Bioinformatics online.


2019 ◽  
Vol 20 (24) ◽  
pp. 6350 ◽  
Author(s):  
Nan Deng ◽  
Chen Hou ◽  
Fengfeng Ma ◽  
Caixia Liu ◽  
Yuxin Tian

The limitations of RNA sequencing make it difficult to accurately predict alternative splicing (AS) and alternative polyadenylation (APA) events and long non-coding RNAs (lncRNAs), all of which reveal transcriptomic diversity and the complexity of gene regulation. Gnetum, a genus with ambiguous phylogenetic placement in seed plants, has a distinct stomatal structure and photosynthetic characteristics. In this study, a full-length transcriptome of Gnetum luofuense leaves at different developmental stages was sequenced with the latest PacBio Sequel platform. After correction with short reads generated by Illumina RNA-Seq, 80,496 full-length transcripts were obtained, of which 5269 reads were identified as isoforms of novel genes. Additionally, 1660 lncRNAs and 12,998 AS events were detected. In total, 5647 genes in the G. luofuense leaves had APA, each featuring at least one poly(A) site. Moreover, 67 and 30 genes from the bHLH gene family, which plays an important role in stomatal development and photosynthesis, were identified from the G. luofuense genome and leaf transcripts, respectively. This leaf transcriptome supplements the reference genome of G. luofuense, and the AS events and lncRNAs detected provide valuable resources for future studies investigating the low photosynthetic capacity of Gnetum.


2019 ◽  
Author(s):  
Annett Erkes ◽  
Stefanie Mücke ◽  
Maik Reschke ◽  
Jens Boch ◽  
Jan Grau

Abstract. Plant-pathogenic Xanthomonas bacteria secrete transcription activator-like effectors (TALEs) into host cells, where they act as transcriptional activators on plant target genes to support bacterial virulence. TALEs have a unique modular DNA-binding domain composed of tandem repeats. Two amino acids within each tandem repeat, termed repeat-variable diresidues (RVDs), bind to contiguous nucleotides on the DNA sequence and determine target specificity. In this paper, we propose a novel approach for TALE target prediction to identify potential virulence targets. Our approach accounts for recent findings concerning TALE targeting, including frame-shift binding by repeats of aberrant lengths, and the flexible strand orientation of target boxes relative to the transcription start of the downstream target gene. The computational model can account for dependencies between adjacent RVD positions. Model parameters are learned from the wealth of quantitative data that have been generated over the last years. We benchmark the novel approach, termed PrediTALE, using RNA-seq data after Xanthomonas infection in rice, and find an overall improvement of prediction performance compared with previous approaches. Using PrediTALE, we are able to predict several novel putative virulence targets. However, we also observe that no target genes are predicted by any prediction tool for several TALEs, which we term orphan TALEs for this reason. We postulate that one explanation for orphan TALEs is incomplete gene annotation and, hence, propose to replace promoterome-wide scans with genome-wide scans for target boxes. We demonstrate that known targets from promoterome-wide scans may be recovered by genome-wide scans, whereas the latter, combined with RNA-seq data, are able to detect putative targets independent of existing gene annotations.

Author summary. Diseases caused by plant-pathogenic Xanthomonas bacteria are a serious threat to many important crop plants, including rice. Efficiently protecting plants from these pathogens requires a deeper understanding of infection strategies. For many Xanthomonas strains, such infection strategies depend on a special class of effector proteins, termed transcription activator-like effectors (TALEs). TALEs may specifically activate genes of the host plant and, by this means, re-program the plant cell for the benefit of the pathogen. Target sequences and, consequently, target genes of a specific TALE may be predicted computationally from its amino acids. Here, we propose a novel approach for TALE target prediction that makes use of several insights into TALE biology as well as broad experimental data gained over the last years. We demonstrate that this approach yields a higher prediction accuracy than previous approaches. We further postulate that a strategy change from a restricted search only considering promoters of annotated genes to a broad genome-wide search is feasible and yields novel targets, including previously neglected protein-coding genes as well as non-coding RNAs of possibly regulatory function.


2019 ◽  
Author(s):  
Daniel Selechnik ◽  
Mark F. Richardson ◽  
Richard Shine ◽  
Jayna DeVore ◽  
Simon Ducatez ◽  
...  

Abstract. Invasive species often exhibit rapid evolution in their introduced ranges despite the genetic bottlenecks that are thought to accompany the translocation of small numbers of founders; however, some invasions may not fit this “genetic paradox.” The invasive cane toad (Rhinella marina) displays high phenotypic variation across its environmentally heterogeneous introduced Australian range. Here, we used three genome-wide datasets to characterize population structure and genetic diversity in invasive toads: RNA-Seq data generated from spleens sampled from the toads’ native range in French Guiana, the introduced population in Hawai’i that was the source of Australian founders, and Australia; RNA-Seq data generated from brains sampled more extensively in Hawai’i and Australia; and previously published RADSeq data from transects across Australia. We found that toads form three genetic clusters: (1) native range toads, (2) toads from the source population in Hawai’i and long-established areas near introduction sites in Australia, and (3) toads from more recently established northern Australian sites. In addition to strong divergence between native and invasive populations, we find evidence for a reduction in genetic diversity after introduction. However, we do not see this reduction in loci putatively under selection, suggesting that genetic diversity may have been maintained at ecologically relevant traits, or that mutation rates were high enough to maintain adaptive potential. Nonetheless, cane toads encounter novel environmental challenges in Australia and appear to respond to selection across environmental breaks; the transition between genetic clusters occurs at a point along the invasion transect where temperature rises and rainfall decreases. We identify loci known to be involved in resistance to heat and dehydration that show evidence of selection in Australian toads. Despite well-known predictions regarding genetic drift and spatial sorting during invasion, this study highlights that natural selection occurs rapidly and plays a vital role in shaping the structure of invasive populations.

Author summary. Despite longstanding evidence for the link between genetic diversity and population viability, the “genetic paradox” concept reflects the observation that invasive populations are successful in novel environments despite a putative reduction in genetic diversity. However, some recent studies have suggested that successful invasions may often occur due to an absence of obstacles such as genetic diversity loss or novel adaptive challenges. The recent emergence of genome-wide technologies provides us with the tools to study this question comprehensively by assessing both overall genetic diversity and the diversity of loci that underlie ecologically relevant traits. The invasive cane toad is a useful model because there is abundant phenotypic evidence of rapid adaptation during invasion. Our results suggest strong genetic divergence between native and invasive populations, and a reduction in overall genetic diversity; however, we do not see this reduction when solely assessing ecologically relevant loci. This could be for reasons that support or refute the genetic paradox. Further studies may provide perspectives from other systems, allowing us to explore how variables such as propagule size affect the fit of an invasion to the model of the paradox. Studying invasive species remains important due to their largely negative impacts on the environment and economy.


Cell Reports ◽  
2021 ◽  
Vol 34 (3) ◽  
pp. 108629
Author(s):  
Kathrin Leppek ◽  
Gun Woo Byeon ◽  
Kotaro Fujii ◽  
Maria Barna

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Karen R. Mifsud ◽  
Clare L. M. Kennedy ◽  
Silvia Salatino ◽  
Eshita Sharma ◽  
Emily M. Price ◽  
...  

Abstract. Glucocorticoid hormones (GCs), acting through hippocampal mineralocorticoid receptors (MRs) and glucocorticoid receptors (GRs), are critical to physiological regulation and behavioural adaptation. We conducted genome-wide MR and GR ChIP-seq and Ribo-Zero RNA-seq studies on rat hippocampus to elucidate MR- and GR-regulated genes under circadian variation or acute stress. In a subset of genes, these physiological conditions resulted in enhanced MR and/or GR binding to DNA sequences and associated transcriptional changes. Binding of MR at a substantial number of sites, however, remained unchanged. MR and GR binding occur at overlapping as well as distinct loci. Moreover, although the GC response element (GRE) was the predominant motif, the transcription factor recognition site composition within MR and GR binding peaks shows marked differences. Pathway analysis uncovered that MR and GR regulate a substantial number of genes involved in synaptic/neuro-plasticity, cell morphology and development, behavior, and neuropsychiatric disorders. We find that MR, not GR, is the predominant receptor binding to >50 ciliary genes, and that MR function is linked to neuronal differentiation and ciliogenesis in human fetal neuronal progenitor cells. These results show that hippocampal MRs and GRs constitutively and dynamically regulate genomic activities underpinning neuronal plasticity and behavioral adaptation to changing environments.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Verônica R. de Melo Costa ◽  
Julianus Pfeuffer ◽  
Annita Louloupi ◽  
Ulf A. V. Ørom ◽  
Rosario M. Piro

Abstract. Background: Introns are generally removed from primary transcripts to form mature RNA molecules in a post-transcriptional process called splicing. Efficient splicing of primary transcripts is an essential step in gene expression, and its misregulation is related to numerous human diseases. Thus, to better understand the dynamics of this process and the perturbations that might be caused by aberrant transcript processing, it is important to quantify splicing efficiency. Results: Here, we introduce SPLICE-q, a fast and user-friendly Python tool for genome-wide SPLICing Efficiency quantification. It supports studies focusing on the implications of splicing efficiency in transcript processing dynamics. SPLICE-q uses aligned reads from strand-specific RNA-seq to quantify splicing efficiency for each intron individually and allows the user to select different levels of restrictiveness concerning the introns’ overlap with other genomic elements such as exons of other genes. We applied SPLICE-q to globally assess the dynamics of intron excision in yeast and human nascent RNA-seq. We also show its application using total RNA-seq from a patient-matched prostate cancer sample. Conclusions: Our analyses illustrate that SPLICE-q is suitable for detecting a progressive increase of splicing efficiency throughout a time course of nascent RNA-seq, and it might be useful for understanding cancer progression beyond mere gene expression levels. SPLICE-q is available at: https://github.com/vrmelo/SPLICE-q
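The abstract does not state how SPLICE-q scores an intron, so the following is only a minimal sketch of one plausible per-intron measure: the share of reads at the splice junctions that are already spliced (split) rather than unspliced. The function name, its parameters, and the averaging of the two junction counts are all assumptions made for illustration, not SPLICE-q's documented formula.

```python
def splicing_efficiency(split_reads, unsplit_5p, unsplit_3p):
    """Illustrative per-intron splicing efficiency (hypothetical definition).

    split_reads: reads spanning the exon-exon junction (intron removed)
    unsplit_5p:  reads crossing the 5' splice site without being split
    unsplit_3p:  reads crossing the 3' splice site without being split
    Returns a value in [0, 1], or None when no read covers the junctions.
    """
    # Assumption: average the two unspliced counts so that an intron is not
    # counted twice when reads cover both of its boundaries.
    unsplit = (unsplit_5p + unsplit_3p) / 2
    total = split_reads + unsplit
    if total == 0:
        return None  # no coverage, efficiency is undefined
    return split_reads / total


if __name__ == "__main__":
    # 30 spliced reads against an average of 10 unspliced reads
    print(splicing_efficiency(30, 10, 10))
```

Under this definition a value near 1 means the intron is almost fully excised in the sample, and a value near 0 means it is mostly retained, which is the kind of trend a nascent RNA-seq time course would be expected to reveal.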


Horticulturae ◽  
2021 ◽  
Vol 7 (6) ◽  
pp. 149
Author(s):  
Chao Gong ◽  
Qiangqiang Pang ◽  
Zhiliang Li ◽  
Zhenxing Li ◽  
Riyuan Chen ◽  
...  

Under high-temperature stress, a large number of proteins in plant cells are denatured and inactivated. Meanwhile, Hsfs and Hsps are quickly induced to remove denatured proteins and thereby avoid programmed cell death, enhancing the thermotolerance of plants. Here, a comprehensive identification and analysis of the Hsf and Hsp gene families in eggplant under heat stress was performed. A total of 24 Hsf-like genes and 117 Hsp-like genes were identified from the eggplant genome using the interolog from Arabidopsis. The gene structure and motif composition of Hsf and Hsp genes were relatively conserved within each subfamily in eggplant. RNA-seq data and qRT-PCR analysis showed that the expression of most eggplant Hsf and Hsp genes increased upon exposure to heat stress, especially in the thermotolerant line. The comprehensive analysis indicated that different sets of SmHsps genes were involved downstream of particular SmHsfs genes. These results provide a basis for revealing the roles of SmHsfs and SmHsps in thermotolerance in eggplant, and may be useful for understanding the thermotolerance mechanism involving these genes.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Ryan Lusk ◽  
Evan Stene ◽  
Farnoush Banaei-Kashani ◽  
Boris Tabakoff ◽  
Katerina Kechris ◽  
...  

Abstract. Annotation of polyadenylation sites from short-read RNA sequencing alone is a challenging computational task. Other algorithms rooted in DNA sequence predict potential polyadenylation sites; however, in vivo expression of a particular site varies based on a myriad of conditions. Here, we introduce aptardi (alternative polyadenylation transcriptome analysis from RNA-Seq data and DNA sequence information), which leverages both DNA sequence and RNA sequencing in a machine learning paradigm to predict expressed polyadenylation sites. Specifically, as input aptardi takes DNA nucleotide sequence, genome-aligned RNA-Seq data, and an initial transcriptome. The program evaluates these initial transcripts to identify expressed polyadenylation sites in the biological sample and refines transcript 3′-ends accordingly. The average precision of the aptardi model is twice that of a standard transcriptome assembler. In particular, the recall of the aptardi model (the proportion of true polyadenylation sites detected by the algorithm) is improved by over three-fold. Also, the model, trained using the Human Brain Reference RNA commercial standard, performs well when applied to RNA-sequencing samples from different tissues and different mammalian species. Finally, aptardi’s input is simple to compile and its output is easily amenable to downstream analyses such as quantitation and differential expression.
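To make the precision and recall figures above concrete, here is a minimal sketch of how predicted polyadenylation sites could be scored against a reference set on a single chromosome and strand. The function names, the coordinate lists, and the `tolerance` window (a predicted site counts as correct if it lies within that many nucleotides of a reference site) are hypothetical choices for illustration; the abstract does not describe aptardi's own evaluation procedure.

```python
from bisect import bisect_left


def fraction_near(query, targets, tolerance):
    """Fraction of `query` positions within `tolerance` nt of any target."""
    if not query:
        return 0.0
    sorted_targets = sorted(targets)
    hits = 0
    for q in query:
        # First target at or beyond the left edge of the window around q
        i = bisect_left(sorted_targets, q - tolerance)
        if i < len(sorted_targets) and sorted_targets[i] <= q + tolerance:
            hits += 1
    return hits / len(query)


def precision_recall(predicted, reference, tolerance=0):
    """Precision: share of predicted sites that match a reference site.
    Recall: share of reference (true) sites recovered by a prediction."""
    precision = fraction_near(predicted, reference, tolerance)
    recall = fraction_near(reference, predicted, tolerance)
    return precision, recall


if __name__ == "__main__":
    predicted = [100, 205, 900]          # made-up coordinates
    reference = [102, 200, 500, 1000]    # made-up coordinates
    print(precision_recall(predicted, reference, tolerance=5))
```

In this toy input two of three predictions fall near a reference site and two of four reference sites are recovered, so precision and recall differ; a three-fold gain in recall, as reported in the abstract, means three times as many true sites are found.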


2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Christian Secchi ◽  
Paola Benaglio ◽  
Francesca Mulas ◽  
Martina Belli ◽  
Dwayne Stupack ◽  
...  

Abstract. Background: Adult granulosa cell tumor (aGCT) is a rare type of stromal cell malignant cancer of the ovary characterized by elevated estrogen levels. aGCTs ubiquitously harbor a somatic mutation in the FOXL2 gene, Cys134Trp (c.402C>G); however, the general molecular effect of this mutation and its putative pathogenic role in aGCT tumorigenesis are not completely understood. We previously studied the role of FOXL2C134W, its partner SMAD3 and its antagonist FOXO1 in cellular models of aGCT. Methods: In this work, seeking more comprehensive profiling of FOXL2C134W transcriptomic effects, we performed an RNA-seq analysis comparing the effect of FOXL2WT/SMAD3 and FOXL2C134W/SMAD3 overexpression in an established human GC line (HGrC1), which is not luteinized and bears normal alleles of FOXL2. Results: Our data show that FOXL2C134W/SMAD3 overexpression alters the expression of 717 genes. These genes include known and novel FOXL2 targets (TGFB2, SMARCA4, HSPG2, MKI67, NFKBIA) and are enriched for neoplastic pathways (Proteoglycans in Cancer, Chromatin remodeling, Apoptosis, Tissue Morphogenesis, Tyrosine Kinase Receptors). We additionally expressed the FOXL2 antagonistic Forkhead protein, FOXO1. Surprisingly, overexpression of FOXO1 mitigated 40% of the altered genome-wide effects specifically related to FOXL2C134W, suggesting it can be a new target for aGCT treatment. Conclusions: Our transcriptomic data provide novel insights into potential genes (FOXO1 regulated) that could be used as biomarkers of efficacy in aGCT patients.

