exon junctions
Recently Published Documents


TOTAL DOCUMENTS: 68 (FIVE YEARS: 18)
H-INDEX: 18 (FIVE YEARS: 2)

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Christopher Wilks ◽  
Shijie C. Zheng ◽  
Feng Yong Chen ◽  
Rone Charles ◽  
Brad Solomon ◽  
...  

We present recount3, a resource consisting of over 750,000 publicly available human and mouse RNA sequencing (RNA-seq) samples uniformly processed by our new Monorail analysis pipeline. To facilitate access to the data, we provide the recount3 and snapcount R/Bioconductor packages as well as complementary web resources. Using these tools, data can be downloaded as study-level summaries or queried for specific exon-exon junctions, genes, samples, or other features. Monorail can be used to process local and/or private data, allowing results to be directly compared to any study in recount3. Taken together, our tools help biologists maximize the utility of publicly available RNA-seq data, especially to improve their understanding of newly collected data. recount3 is available from http://rna.recount.bio.


2021 ◽  
Author(s):  
Talia Ishfaq ◽  
Maryam Salik ◽  
Uzma Jafry ◽  
Zaain Ahmad ◽  
James D. Fackenthal

2021 ◽  
Author(s):  
Christopher Wilks ◽  
Shijie C. Zheng ◽  
Feng Yong Chen ◽  
Rone Charles ◽  
Brad Solomon ◽  
...  

We present recount3, a resource consisting of over 750,000 publicly available human and mouse RNA sequencing (RNA-seq) samples uniformly processed by our new Monorail analysis pipeline. To facilitate access to the data, we provide the recount3 and snapcount R/Bioconductor packages as well as complementary web resources. Using these tools, data can be downloaded as study-level summaries or queried for specific exon-exon junctions, genes, samples, or other features. Monorail can be used to process local and/or private data, allowing results to be directly compared to any study in recount3. Taken together, our tools help biologists maximize the utility of publicly available RNA-seq data, especially to improve their understanding of newly collected data. recount3 is available from http://rna.recount.bio.


2021 ◽  
Vol 22 (10) ◽  
pp. 5182
Author(s):  
Xiaoxin Liu ◽  
Jacqueline Frost ◽  
Anne Bowcock ◽  
Weixiong Zhang

(1) Background: Understanding the function of circular RNAs (circRNAs), a class of noncoding RNA, in psoriatic skin can provide important insights into the complex regulation of genes contributing to the pathogenesis of psoriasis. (2) Methods: A novel method was applied to RNA-seq datasets from 93 skin biopsy samples to comprehensively identify circRNAs of all types, i.e., canonical circRNAs from the intron-exon junctions of mRNAs and interior circRNAs (i-circRNAs) from the interior regions of exons, introns, and intergenic regions. Selected circRNAs were experimentally validated by qRT-PCR and Sanger sequencing. CircRNAs with abundant and differential expression were identified and their putative function as competing endogenous RNAs (ceRNAs) was analyzed by an integrated analysis of circRNAs, microRNAs, and mRNAs. (3) Results: With a comprehensive search that used no information about splicing signals, we systematically identified 179 highly abundant circRNAs in psoriatic skin. Many of these were reported for the first time and many were differentially expressed in involved versus normal or uninvolved skin. Validation based on three additional RNA-seq datasets confirmed most of the identified circRNAs in psoriatic skin. Experimental analyses confirmed the expression of the well-known circRNA CDR1as, a canonical circRNA, and a novel i-circRNA in psoriasis. We also identified many circRNAs that may act as ceRNAs to regulate the expression of mRNA genes in psoriasis-related signaling pathways. (4) Conclusions: The results of the study suggest that circRNAs are abundant in psoriatic skin, have distinct characteristics, and contribute to psoriatic pathogenesis.


2021 ◽  
Author(s):  
Ayushi Rehman ◽  
Pratap Chandra ◽  
Kusum Kumari Singh

A central processing event in eukaryotic gene expression is splicing. Concurrent with splicing, the core-EJC proteins, eIF4A3 and the RBM8A-MAGOH heterodimer, are deposited 24 bases upstream of newly formed exon-exon junctions. One of the core-EJC proteins, MAGOH, has a paralog, MAGOHB, and this paralog pair is conserved across vertebrates. Upon analysis of the splice variants of the MAGOH paralogs, we found alternate protein isoforms that are also evolutionarily conserved. Further, comparison of the amino acid sequences of the principal and alternate protein isoforms revealed the absence of key amino acid residues in the alternate isoforms. The conservation of principal and alternate isoforms correlates with the importance of MAGOH and MAGOHB across vertebrates.
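
As a rough illustration of the 24-base rule mentioned in the abstract above, the following sketch computes where EJCs would be expected on a spliced mRNA, given only the exon lengths. The function name `ejc_positions`, the 0-based coordinate convention, and the choice to skip junctions closer than 24 bases to the transcript start are assumptions made for this example, not details taken from the paper.

```python
from itertools import accumulate

EJC_OFFSET = 24  # bases upstream of the junction, as stated in the abstract


def ejc_positions(exon_lengths: list[int], offset: int = EJC_OFFSET) -> list[int]:
    """Return expected EJC positions (0-based, on the spliced mRNA).

    A junction coordinate is the index of the first base of the downstream
    exon. Junctions lying fewer than `offset` bases from the transcript
    start are skipped (an assumption of this sketch).
    """
    # Cumulative exon ends give the junctions; the last exon end is not a junction.
    junctions = list(accumulate(exon_lengths))[:-1]
    return [j - offset for j in junctions if j - offset >= 0]


# Three exons of 100, 50 and 200 bases have junctions at 100 and 150.
print(ejc_positions([100, 50, 200]))
```

Working in spliced-mRNA coordinates keeps the arithmetic to a cumulative sum; mapping back to genomic coordinates would additionally need intron lengths.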


Author(s):  
Xiaowen Feng ◽  
Heng Li

LINE-1-mediated retrotransposition of protein-coding mRNAs is an active process in modern humans for both germline and somatic genomes. Prior works that surveyed human data mostly relied on detecting discordant mappings of paired-end short reads, or exon junctions contained in short reads. Moreover, there have been few genome-wide comparisons between gene retrocopies in great apes and humans. In this study, we introduced a more sensitive and accurate method to identify processed pseudogenes. Our method utilizes long-read assemblies, and more importantly, is able to provide full-length retrocopy sequences as well as flanking regions which are missed by short-read based methods. From 22 human individuals, we pinpointed 40 processed pseudogenes that are not present in the human reference genome GRCh38 and identified 17 pseudogenes that are in GRCh38 but absent from some input individuals. This represents a significantly higher discovery rate than previous reports (39 pseudogenes not in the reference genome out of 939 individuals). We also provided an overview of lineage-specific retrocopies in chimpanzee, gorilla, and orangutan genomes.


Author(s):  
Kenneth L Campbell ◽  
Nurit Haspel ◽  
Cassandra Gath ◽  
Nuzulul Kurniatash ◽  
Indira (Nouduri) Akkiraju ◽  
...  

This study explores the hypothesis that protein hormones are nested information systems in which initial products of gene transcription, and their subsequent protein fragments, before and after secretion and initial target cell action, play additional physiological regulatory roles. The study produced four tools and key results: 1) a problem approach that proceeds, with examples and suggestions for in vivo organismal functional tests for peptide-protein interactions, from proteolytic breakdown prediction to models of hormone fragment modulation of protein–protein binding motifs in unrelated proteins; 2) a catalog of 461 known soluble human protein hormones and their predicted fragmentation patterns; 3) an analysis of the predicted proteolytic patterns of the canonical protein hormone transcripts demonstrating near-universal persistence of 9 ± 7 peptides of 8 ± 8 amino acids even after cleavage with 24 proteases from four protease classes; and, 4) a coincidence analysis of the predicted proteolysis locations and the 1939 exon junctions within the transcripts that shows an excess (P < 0.001) of predicted proteolysis within 10 residues, especially at the exonal junction (P < 0.01). It appears all protein hormone transcripts generate multiple fragments the size of peptide hormones or protein–protein binding domains that may alter intracellular or extracellular functions by acting as modulators of metabolic enzymes, transduction factors, protein binding proteins, or hormone receptors. High proteolytic frequency at exonal junctions suggests proteolysis has evolved, as a complement to gene exon fusion, to extract structures or functions within single exons or protein segments to simplify the genome by discarding archaic one-exon genes.


PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e10063
Author(s):  
Sam Humphrey ◽  
Alastair Kerr ◽  
Magnus Rattray ◽  
Caroline Dive ◽  
Crispin J. Miller

Molecular sequences carry information. Analysis of sequence conservation between homologous loci is a proven approach with which to explore the information content of molecular sequences. This is often done using multiple sequence alignments to support comparisons between homologous loci. These methods therefore rely on sufficient underlying sequence similarity with which to construct a representative alignment. Here we describe a method using a formal metric of information, surprisal, to analyse biological sub-sequences without alignment constraints. We applied our model to the genomes of five different species to reveal similar patterns across a panel of eukaryotes. As the surprisal of a sub-sequence is inversely proportional to its occurrence within the genome, the optimal size of the sub-sequences was selected for each species under consideration. With the model optimized, we found a strong correlation between surprisal and CG dinucleotide usage. The utility of our model was tested by examining the sequences of genes known to undergo splicing. We demonstrate that our model can identify biological features of interest such as known donor and acceptor sites. Analysis across all annotated coding exon junctions in Homo sapiens reveals the information content of coding exons to be greater than the surrounding intron regions, a consequence of increased suppression of the CG dinucleotide in intronic space. Sequences within coding regions proximal to exon junctions exhibited novel patterns within DNA and coding mRNA that are not a function of the encoded amino acid sequence. Our findings are consistent with the presence of secondary information encoding features such as DNA and RNA binding sites, multiplexed through the coding sequence and independent of the information required to define the corresponding amino-acid sequence. 
We conclude that surprisal provides a complementary methodology with which to locate regions of interest in the genome, particularly in situations that lack an appropriate multiple sequence alignment.
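
To make the notion of surprisal in the abstract above concrete, here is a minimal sketch that scores every sub-sequence (k-mer) of a sequence by its surprisal in bits, using the standard definition: the negative base-2 logarithm of the k-mer's observed frequency. The function name `kmer_surprisal`, the toy sequence, and the use of overlapping k-mers from a single sequence are assumptions for illustration; the authors' actual model and their per-species choice of sub-sequence size are not described here.

```python
import math
from collections import Counter


def kmer_surprisal(sequence: str, k: int) -> dict[str, float]:
    """Return the surprisal, in bits, of every k-mer observed in `sequence`.

    Surprisal is -log2(p), where p is the fraction of all overlapping
    k-mers in the sequence that match the k-mer in question.
    """
    kmers = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    counts = Counter(kmers)
    total = len(kmers)
    return {kmer: -math.log2(n / total) for kmer, n in counts.items()}


# Rare k-mers receive higher surprisal than common ones.
scores = kmer_surprisal("AAAC", 1)
print(scores)
```

Because the score depends only on occurrence counts, no alignment is needed, which matches the alignment-free motivation given in the abstract.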


2020 ◽  
Author(s):  
Brian Joseph ◽  
Eric C. Lai

Accurate splice site selection is critical for fruitful gene expression. Here, we demonstrate that the Drosophila EJC suppresses hundreds of functional cryptic splice sites (SS), even though the majority of these bear weak splicing motifs and appear incompetent. Mechanistically, the EJC directly conceals splicing elements through position-specific recruitment, preventing SS definition. We note that intron removal using strong, canonical SS yields AG|GU signatures at exon-exon junctions. Unexpectedly, we discover that scores of these minimal exon junction sequences are in fact EJC-suppressed 5’ and 3’ recursive SS, and that loss of EJC regulation from such transcripts triggers faulty mRNA resplicing. An important corollary is that intronless cDNA expression constructs from the aforementioned targets yield high levels of unanticipated, truncated transcripts generated by resplicing. Consequently, we conclude that the EJC has ancestral roles to defend transcriptome fidelity by (1) repressing illegitimate splice sites on pre-mRNAs, and (2) preventing inadvertent activation of such sites on spliced segments.
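
The AG|GU signature described in the abstract above can be checked mechanically. The sketch below takes the exons of a spliced transcript as separate strings and reports which junctions read AG on the upstream side and GU on the downstream side. The function name `ag_gu_junctions` and the toy exons are hypothetical, and the check is purely a string match; it says nothing about whether a site is actually used for resplicing.

```python
def ag_gu_junctions(exons: list[str]) -> list[int]:
    """Return indices i where the junction between exons[i] and exons[i + 1]
    reads AG|GU. DNA input is accepted and converted to the RNA alphabet."""
    hits = []
    for i in range(len(exons) - 1):
        upstream = exons[i].upper().replace("T", "U")
        downstream = exons[i + 1].upper().replace("T", "U")
        if upstream.endswith("AG") and downstream.startswith("GU"):
            hits.append(i)
    return hits


# Only the first junction (AG|GU) matches; the second reads AG|CU.
print(ag_gu_junctions(["AUGCAG", "GUCCAG", "CUUAA"]))
```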


Biomolecules ◽  
2020 ◽  
Vol 10 (6) ◽  
pp. 866 ◽  
Author(s):  
Lena P. Schlautmann ◽  
Niels H. Gehring

The exon junction complex (EJC) is an abundant messenger ribonucleoprotein (mRNP) component that is assembled during splicing and binds to mRNAs upstream of exon-exon junctions. EJCs accompany the mRNA during its entire life in the nucleus and the cytoplasm and communicate the information about the splicing process and the position of introns. Specifically, the EJC’s core components and its associated proteins regulate different steps of gene expression, including pre-mRNA splicing, mRNA export, translation, and nonsense-mediated mRNA decay (NMD). This review summarizes the most important functions and main protagonists in the life of the EJC. It also provides an overview of the latest findings on the assembly, composition and molecular activities of the EJC and presents them in the chronological order in which they play a role in the EJC’s life cycle.

