BREC: an R package/Shiny app for automatically identifying heterochromatin boundaries and estimating local recombination rates along chromosomes

2021 ◽  
Vol 22 (S6) ◽  
Author(s):  
Yasmine Mansour ◽  
Annie Chateau ◽  
Anna-Sophie Fiston-Lavier

Abstract. Background: Meiotic recombination is a vital biological process that plays an essential role in the structural and functional dynamics of genomes. Genomes exhibit highly varied recombination profiles along chromosomes, associated with several chromatin states. However, eu-heterochromatin boundaries are neither available nor easily determined for non-model organisms, especially newly sequenced ones. As a result, the accurate local recombination rates needed to address evolutionary questions are missing. Results: Here, we propose an automated computational tool, based on the Marey map method, that identifies heterochromatin boundaries along chromosomes and estimates local recombination rates. Our method, called BREC (heterochromatin Boundaries and RECombination rate estimates), is not genome-specific and runs even on non-model genomes as long as genetic and physical maps are available. BREC is purely statistical and data-driven, so good input data quality remains a strong requirement. A data pre-processing module (data quality control and cleaning) is therefore provided. Experiments show that BREC handles differences in marker density and distribution. Conclusions: BREC's heterochromatin boundaries have been validated against experimentally generated cytological equivalents for the fruit fly Drosophila melanogaster genome, for which BREC returns congruent values. BREC's recombination rates have also been compared with previously reported estimates. Based on these promising results, we believe our tool has the potential to help bring data science into the service of genome biology and evolution. We provide BREC as an R package and as a user-friendly Shiny web application, yielding a fast, easy-to-use, and broadly accessible resource. The BREC R package is available at the GitHub repository https://github.com/GenomeStructureOrganization.
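The Marey map idea mentioned in the abstract can be illustrated with a minimal sketch: markers are plotted by physical position against genetic position, and the local recombination rate is the slope of that curve. The code below is only an illustration under stated assumptions, not BREC's implementation. The function name, the toy marker values, and the use of simple finite differences between consecutive markers are all hypothetical; the abstract does not describe BREC's regression or boundary-detection steps, and they are not reproduced here.

```python
from typing import List, Tuple


def local_recombination_rates(
    markers: List[Tuple[float, float]],
) -> List[Tuple[float, float]]:
    """Estimate the recombination rate (cM/Mb) between consecutive markers.

    Each marker is a (physical_position_mb, genetic_position_cm) pair.
    Returns (interval_midpoint_mb, rate_cm_per_mb) pairs.
    """
    ordered = sorted(markers)  # order markers by physical position
    rates = []
    for (p0, g0), (p1, g1) in zip(ordered, ordered[1:]):
        if p1 == p0:
            continue  # two markers at the same physical position: no slope
        # Slope of the Marey map on this interval; a negative slope
        # (map inconsistency) is clamped to zero in this sketch.
        rate = max(0.0, (g1 - g0) / (p1 - p0))
        rates.append(((p0 + p1) / 2.0, rate))
    return rates


# Hypothetical toy data: (physical position in Mb, genetic position in cM)
toy_markers = [(0.0, 0.0), (2.0, 1.0), (4.0, 5.0), (6.0, 11.0), (8.0, 12.0)]
rates = local_recombination_rates(toy_markers)
for midpoint, rate in rates:
    print(f"{midpoint:.1f} Mb: {rate:.2f} cM/Mb")
```

In this toy example the slope is low at both ends and high in the middle, which is the kind of pattern one would expect if the ends of the arm were recombination-poor. Real data are noisier, which is one reason a smoothed fit is usually preferred over raw finite differences.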

2020 ◽  
Author(s):  
Yasmine Mansour ◽  
Annie Chateau ◽  
Anna-Sophie Fiston-Lavier

Abstract. Motivation: Meiotic recombination is a vital biological process that plays an essential role in the structural and functional dynamics of genomes. Genomes exhibit highly varied recombination profiles along chromosomes, associated with several chromatin states. However, eu-heterochromatin boundaries are neither available nor easily determined for non-model organisms, especially newly sequenced ones. As a result, the accurate local recombination rates needed to address evolutionary questions are missing. Results: Here, we propose an automated computational tool, based on the Marey map method, that identifies heterochromatin boundaries along chromosomes and estimates local recombination rates. Our method, called BREC (heterochromatin Boundaries and RECombination rate estimates), is not genome-specific and runs even on non-model genomes as long as genetic and physical maps are available. BREC is purely statistical and data-driven, so good input data quality remains a strong requirement. A data pre-processing module (data quality control and cleaning) is therefore provided. Experiments show that BREC handles differences in marker density and distribution. BREC's heterochromatin boundaries have been validated against experimentally generated cytological equivalents for the fruit fly Drosophila melanogaster genome, for which BREC returns congruent values. BREC's recombination rates have also been compared with previously reported estimates. Based on these promising results, we believe our tool has the potential to help bring data science into the service of genome biology and evolution. We provide BREC as an R package and as a user-friendly Shiny web application, yielding a fast, easy-to-use, and broadly accessible resource. Availability: The BREC R package is available at the GitHub repository https://github.com/ymansour21/BREC.


2015 ◽  
Author(s):  
Alexander Zizka ◽  
Alexandre Antonelli

1. Large-scale species occurrence data from geo-referenced observations and collected specimens are crucial for analyses in ecology, evolution and biogeography. Despite the rapidly growing availability of such data, their use in evolutionary analyses is often hampered by tedious manual classification of point occurrences into operational areas, leading to a lack of reproducibility and concerns regarding data quality. 2. Here we present speciesgeocodeR, a user-friendly R package for data cleaning, data exploration and data visualization of species point occurrences using discrete operational areas, and for linking them to analyses involving phylogenetic trees. 3. The three core functions of the package are 1) automated and reproducible data cleaning, 2) rapid and reproducible classification of point occurrences into discrete operational areas in a format suitable for subsequent biogeographic analyses, and 3) a comprehensive summary and visualization of species distributions to explore large datasets and ensure data quality. In addition, speciesgeocodeR facilitates access to and analysis of publicly available species occurrence data, widely used operational areas and elevation ranges. Other functionalities include the implementation of minimum occurrence thresholds and the visualization of coexistence patterns and range sizes. speciesgeocodeR is accompanied by a richly illustrated, easy-to-follow tutorial and help functions.
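The second core function, classifying point occurrences into discrete operational areas, amounts to a point-in-polygon test per occurrence. The sketch below shows one common way to do this, the ray-casting test, in plain Python. It is an illustration only and not speciesgeocodeR's code or API: the function names, the rectangular areas, and the occurrence coordinates are hypothetical, and coordinates are treated as planar.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # (longitude, latitude), treated as planar
Polygon = List[Point]        # vertices in order, without repeating the first


def point_in_polygon(point: Point, polygon: Polygon) -> bool:
    """Ray-casting test: count edge crossings of a ray going right from point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def classify(point: Point, areas: Dict[str, Polygon]) -> Optional[str]:
    """Return the name of the first area containing the point, else None."""
    for name, polygon in areas.items():
        if point_in_polygon(point, polygon):
            return name
    return None


# Hypothetical operational areas and occurrence records
areas = {
    "area_A": [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)],
    "area_B": [(10.0, 0.0), (20.0, 0.0), (20.0, 10.0), (10.0, 10.0)],
}
occurrences = {"sp1": (5.0, 5.0), "sp2": (15.0, 2.0), "sp3": (30.0, 30.0)}
assigned = {name: classify(pt, areas) for name, pt in occurrences.items()}
print(assigned)
```

Records that fall outside every area are returned as `None` rather than dropped, so that unclassified occurrences stay visible during data cleaning.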


2017 ◽  
Author(s):  
Anob M. Chakrabarti ◽  
Nejc Haberman ◽  
Arne Praznik ◽  
Nicholas M. Luscombe ◽  
Jernej Ule

Abstract. An interplay of experimental and computational methods is required to achieve a comprehensive understanding of protein-RNA interactions. Crosslinking and immunoprecipitation (CLIP) identifies endogenous interactions by sequencing RNA fragments that co-purify with a selected RBP under stringent conditions. Here we focus on approaches for the analysis of resulting data and appraise the methods for peak calling, visualisation, analysis and computational modelling of protein-RNA binding sites. We advocate a combined assessment of cDNA complexity and specificity for data quality control. Moreover, we demonstrate the value of analysing sequence motif enrichment in peaks assigned from CLIP data, and of visualising RNA maps, which examine the positional distribution of peaks around regulated landmarks in transcripts. We use these to assess how variations in CLIP data quality, and in different peak calling methods, affect the insights into regulatory mechanisms. We conclude by discussing future opportunities for the computational analysis of protein-RNA interaction experiments.


2019 ◽  
Vol 5 ◽  
Author(s):  
Álvaro Briz-Redón

Spatial statistics is an important field of data science with applications in many different areas of study, such as epidemiology, criminology, seismology, astronomy and econometrics. In particular, spatial statistics has frequently been used to analyze traffic accident datasets with explanatory and preventive objectives. Traditionally, these studies have employed spatial statistics techniques at some level of areal aggregation, usually related to administrative units. However, the last decade has brought an increasing number of works on the spatial incidence and distribution of traffic accidents at the road level by means of the spatial structure known as a linear network. This change seems positive because it could provide deeper and more accurate investigations than previous studies based on areal spatial units. Working at the road level raises some technical difficulties due to the high complexity of these structures, especially in terms of manipulation and rectification. The R Shiny app SpNetPrep, which is available online and via an R package of the same name, aims to provide functionalities that could be useful for a user who is interested in performing a spatial analysis over a road network structure.


2019 ◽  
Vol 5 ◽  
Author(s):  
Giulio Genova ◽  
Mattia Rossi ◽  
Georg Niedrist ◽  
Stefano Della Chiesa

Meteo Browser South Tyrol is a user-friendly web-based application that helps to visualize and download the hydro-meteorological time series freely available in South Tyrol, Italy. It is designed for a wide range of users, from ordinary citizens and students to researchers, private companies and the public administration. Meteo Browser South Tyrol is a Shiny app inside an R package and can be used on a local machine or accessed online. Drop-down menus allow the user to select hydro-meteorological stations and measurements. A simple map shows where the monitoring stations are and the latest measurements available, and lets the user subset the selected stations geographically by drawing a polygon.


2015 ◽  
Author(s):  
Alexandru Al. Ecovoiu ◽  
Iulian Constantin Ghionoiu ◽  
Andrei Mihai Ciuca ◽  
Attila Cristian Ratiu

A critical topic in insertional mutagenesis experiments performed on model organisms is mapping the hits of artificial transposons (ATs) at nucleotide-level accuracy. Mapping errors may occur when sequencing artifacts or mutations such as SNPs and small indels are present very close to the junction between a genomic sequence and a transposon inverted repeat (TIR). Another particular issue in insertional mutagenesis is the mapping of transposon self-insertions and, to the best of our knowledge, there is no publicly available mapping tool designed to analyze such molecular events. We developed Genome ARTIST, a pairwise gapped aligner tool which addresses both issues by means of an original, robust mapping strategy. Genome ARTIST is not designed to use NGS data but to analyze AT insertions obtained in small to medium-scale mutagenesis experiments. Genome ARTIST employs a heuristic approach to find DNA sequence similarities and harnesses a multi-step implementation of an adapted Smith-Waterman algorithm to compute the mapping alignments. The experience is enhanced by easily customizable parameters and a user-friendly interface that describes the genomic landscape surrounding the insertion. Genome ARTIST deals with many genomes of bacteria and eukaryotes available in the Ensembl and GenBank repositories. Our tool specifically exploits the sequence annotation data provided by FlyBase for Drosophila melanogaster (the fruit fly), which enables mapping of insertions relative to various genomic features such as natural transposons. Genome ARTIST was tested against other alignment tools using relevant query sequences derived from the D. melanogaster and Mus musculus (mouse) genomes. Real and simulated query sequences were also compared, revealing that Genome ARTIST is a very robust solution for mapping transposon insertions.
Genome ARTIST is a stand-alone, user-friendly application designed for high-accuracy mapping of transposon insertions and self-insertions. The tool is also useful for routine alignment tasks such as detecting SNPs or checking the specificity of primers and probes. Genome ARTIST is open-source software and is available for download at www.genomeartist.ro and at www.bioinformatics.org.
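For readers unfamiliar with the Smith-Waterman algorithm named above, the following minimal sketch computes the best local alignment score between two sequences by dynamic programming. It is the textbook form with a linear gap penalty, not Genome ARTIST's adapted multi-step implementation; the function name and the scoring values (match 2, mismatch -1, gap -2) are assumptions chosen for illustration.

```python
def smith_waterman_score(
    a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2
) -> int:
    """Best local alignment score of a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for ca in a:
        curr = [0]  # first column is always zero
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            # Local alignment: scores never drop below zero.
            score = max(0, diag, up, left)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


# Hypothetical sequences sharing the local motif "ACG"
print(smith_waterman_score("TTACGTT", "GGACGGG"))
```

Flooring every cell at zero is what makes the alignment local: a poorly matching flank, such as the region beyond a transposon-genome junction, cannot drag down the score of a well-matching core.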


PLoS ONE ◽  
2021 ◽  
Vol 16 (12) ◽  
pp. e0262145
Author(s):  
Olatunji Johnson ◽  
Claudio Fronterre ◽  
Peter J. Diggle ◽  
Benjamin Amoah ◽  
Emanuele Giorgi

User-friendly interfaces have been increasingly used to facilitate the learning of advanced statistical methodology, especially for students with only minimal statistical training. In this paper, we illustrate the use of MBGapp for teaching geostatistical analysis to population health scientists. Using a case-study on Loa loa infections, we show how MBGapp can be used to teach the different stages of a geostatistical analysis in a more interactive fashion. For wider accessibility and usability, MBGapp is available as an R package and as a Shiny web-application that can be freely accessed on any web browser. In addition to MBGapp, we also present an auxiliary Shiny app, called VariagramApp, that can be used to aid the teaching of Gaussian processes in one and two dimensions using simulations.


2018 ◽  
Author(s):  
Nam D. Nguyen ◽  
Ian K. Blaby ◽  
Daifeng Wang

Abstract. The coordination of genome-encoded function is a critical and complex process in biological systems, especially across phenotypes or states (e.g., time, disease, organism). Understanding how the complexity of genome-encoded function relates to these states remains a challenge. To address this, we have developed a novel computational method based on manifold learning and comparative analysis, ManiNetCluster, which simultaneously aligns and clusters multiple molecular networks to systematically reveal functional links across multiple datasets. Specifically, ManiNetCluster employs manifold learning to match local and non-linear structures among the networks of different states, to identify cross-network linkages. By applying ManiNetCluster to developmental gene expression datasets across model organisms (e.g., worm, fruit fly), we found that our tool aligns orthologous genes significantly better than existing state-of-the-art methods, indicating non-linear interactions between evolutionary functions in development. Moreover, we applied ManiNetCluster to a series of transcriptomes measured in the green alga Chlamydomonas reinhardtii, to determine the functional links between various metabolic processes between the light and dark periods of a diurnally cycling culture. For example, we identify a number of genes putatively regulating processes across each lighting regime, and we show how comparative analyses between ManiNetCluster and other clustering tools can provide additional insights. ManiNetCluster is available as an R package together with a tutorial at https://github.com/namtk/ManiNetCluster.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Gulden Olgun ◽  
Afshan Nabi ◽  
Oznur Tastan

Abstract. Background: While some non-coding RNAs (ncRNAs) are assigned critical regulatory roles, most remain functionally uncharacterized. This presents a challenge whenever an interesting set of ncRNAs needs to be analyzed in a functional context. Transcripts located close together on the genome are often regulated together, so genomic proximity can hint at a functional association. Results: We present a tool, NoRCE, that performs cis enrichment analysis for a given set of ncRNAs. Enrichment is carried out using the functional annotations of the coding genes located proximal to the input ncRNAs. Other biologically relevant information such as topologically associating domain (TAD) boundaries, co-expression patterns, and miRNA target prediction information can be incorporated to conduct a richer enrichment analysis. To this end, NoRCE includes several relevant datasets as part of its data repository, including cell-line specific TAD boundaries, functional gene sets, and expression data for coding & ncRNAs specific to cancer. Additionally, users can utilize custom data files in their investigation. Enrichment results can be retrieved in a tabular format or visualized in several different ways. NoRCE is currently available for the following species: human, mouse, rat, zebrafish, fruit fly, worm, and yeast. Conclusions: NoRCE is a platform-independent, user-friendly, comprehensive R package that can be used to gain insight into the functional importance of a list of ncRNAs of any type. The tool offers users the flexibility to conduct their preferred set of analyses by designing their own analysis pipeline. NoRCE is available in Bioconductor and at https://github.com/guldenolgun/NoRCE.
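Enrichment of a functional annotation among the coding genes near a set of ncRNAs is commonly assessed with a one-sided hypergeometric test. The abstract does not state which test NoRCE applies, so the sketch below is an assumption for illustration, not the package's API; the function name and the example counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(
    population: int, annotated: int, sample: int, overlap: int
) -> float:
    """P(X >= overlap) for X ~ Hypergeometric(population, annotated, sample).

    population: number of background coding genes
    annotated:  background genes carrying the annotation of interest
    sample:     coding genes found near the input ncRNAs
    overlap:    sampled genes that carry the annotation
    """
    max_k = min(annotated, sample)
    total = comb(population, sample)
    # Sum the upper tail of the hypergeometric distribution exactly.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total


# Hypothetical counts: 10 background genes, 5 annotated,
# 5 neighbouring genes sampled, all 5 of them annotated.
p = hypergeom_enrichment_pvalue(population=10, annotated=5, sample=5, overlap=5)
print(p)
```

When many annotation terms are tested at once, the resulting p-values would normally be corrected for multiple testing before interpretation.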

