reference alignment
Recently Published Documents


TOTAL DOCUMENTS

25
(FIVE YEARS 12)

H-INDEX

6
(FIVE YEARS 1)

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Tim W. McInerney ◽  
Brian Fulton-Howard ◽  
Christopher Patterson ◽  
Devashi Paliwal ◽  
Lars S. Jermiin ◽  
...  

Abstract Background: Variation in mitochondrial DNA (mtDNA) identified by genotyping microarrays or by sequencing only the hypervariable regions of the genome may be insufficient to reliably assign mitochondrial genomes to phylogenetic lineages or haplogroups. This lack of resolution can limit functional and clinical interpretation of a substantial body of existing mtDNA data. To address this limitation, we developed and evaluated a large, curated reference alignment of complete mtDNA sequences as part of a pipeline for imputing missing mtDNA single nucleotide variants (mtSNVs). We call our reference alignment and pipeline MitoImpute. Results: We aligned the sequences of 36,960 complete human mitochondrial genomes downloaded from GenBank, filtered and controlled for quality. These sequences were reformatted for use in the imputation software IMPUTE2. We assessed the imputation accuracy of MitoImpute by measuring haplogroup and genotype concordance in data from the 1000 Genomes Project and the Alzheimer’s Disease Neuroimaging Initiative (ADNI). The mean improvement of haplogroup assignment in the 1000 Genomes samples was 42.7% (Matthews correlation coefficient = 0.64). In the ADNI cohort, we imputed missing single nucleotide variants. Conclusion: These results show that our reference alignment and panel can be used to impute missing mtSNVs in existing data obtained using microarrays, thereby broadening the scope of functional and clinical investigation of mtDNA. This improvement may be particularly useful in studies where participants have been recruited over time and mtDNA data were obtained using different methods, enabling better integration of early data collected using less accurate methods with more recent sequence data.
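The Matthews correlation coefficient quoted in this abstract condenses a confusion matrix into one value between -1 and 1. The following is a minimal sketch of the binary form only; the function name and the counts are illustrative assumptions, not MitoImpute code or results, and the multi-class haplogroup comparison in the paper may rely on a generalised formula.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Binary Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional fallback when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, chosen only to exercise the formula.
example = matthews_corrcoef(tp=90, tn=80, fp=20, fn=10)
print(f"MCC = {example:.2f}")
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and -1 indicates complete disagreement.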


2021 ◽  
Vol 7 ◽  
pp. e602
Author(s):  
Xingsi Xue ◽  
Chao Jiang ◽  
Jie Zhang ◽  
Hai Zhu ◽  
Chaofan Yang

Sensors are increasingly used in a variety of applications. Because the collected sensor data lack semantic information, they are heterogeneous at the semantic, schema, and syntax levels. To solve this heterogeneity problem, it is necessary to carry out a sensor ontology matching process that determines correspondences among heterogeneous sensor concepts. In this paper, we propose a Siamese Neural Network based Ontology Matching technique (SNN-OM) to align sensor ontologies, which does not require a reference alignment to train the network model. In particular, a representative concepts extraction method is presented to enhance the model’s performance and reduce the training time, and an alignment refining method is proposed to enhance the alignments’ quality by removing logically conflicting correspondences. The experimental results show that SNN-OM is capable of efficiently determining high-quality sensor ontology alignments.


F1000Research ◽  
2021 ◽  
Vol 10 ◽  
pp. 143
Author(s):  
Travis L. Jensen ◽  
William F. Hooper ◽  
Sami R. Cherikh ◽  
Johannes B. Goll

Ribosomal profiling is an emerging experimental technology to measure protein synthesis by sequencing short mRNA fragments undergoing translation in ribosomes. Applied at genome-wide scale, this is a powerful tool to profile global protein synthesis within cell populations of interest. Such information can be utilized for biomarker discovery and detection of treatment-responsive genes. However, analysis of ribosomal profiling data requires careful preprocessing to reduce the impact of artifacts and dedicated statistical methods for visualizing and modeling the high-dimensional discrete read count data. Here we present Ribosomal Profiling Reports (RP-REP), a new open-source, cloud-enabled software package that allows users to execute start-to-end gene-level ribosomal profiling and RNA-Seq analysis on a pre-configured Amazon Virtual Machine Image (AMI) hosted on AWS or on the user’s own Ubuntu Linux server. The software works with FASTQ files stored locally, on AWS S3, or at the Sequence Read Archive (SRA). RP-REP automatically executes a series of customizable steps including filtering of contaminant RNA, enrichment of true ribosomal footprints, reference alignment and gene translation quantification, gene body coverage, CRAM compression, reference alignment QC, data normalization, multivariate data visualization, identification of differentially translated genes, and generation of heatmaps, co-translated gene clusters, enriched pathways, and other custom visualizations. RP-REP provides functionality to contrast RNA-Seq and ribosomal profiling results, and calculates translational efficiency per gene. The software outputs a PDF report and publication-ready table and figure files. As a use case, we provide RP-REP results for a dengue virus study that tested cytosol and endoplasmic reticulum cellular fractions of human Huh7 cells pre-infection and at 6 h, 12 h, 24 h, and 40 h post-infection.
Case study results, Ubuntu installation scripts, and the most recent RP-REP source code are accessible at GitHub. The cloud-ready AMI is available at AWS (AMI ID: RPREP RSEQREP (Ribosome Profiling and RNA-Seq Reports) v2.1 (ami-00b92f52d763145d3)).


Author(s):  
Tamir Bendory ◽  
Ariel Jaffe ◽  
William Leeb ◽  
Nir Sharon ◽  
Amit Singer

Abstract We study super-resolution multi-reference alignment, the problem of estimating a signal from many circularly shifted, down-sampled and noisy observations. We focus on the low SNR regime, and show that a signal in ${\mathbb{R}}^M$ is uniquely determined when the number $L$ of samples per observation is of the order of the square root of the signal’s length ($L=O(\sqrt{M})$). Phrased more informally, one can square the resolution. This result holds if the number of observations is proportional to $1/\textrm{SNR}^3$. In contrast, with fewer observations recovery is impossible even when the observations are not down-sampled ($L=M$). The analysis combines tools from statistical signal processing and invariant theory. We design an expectation-maximization algorithm and demonstrate that it can super-resolve the signal in challenging SNR regimes.
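The observation model described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function name `observe`, the uniform random shift, the regular sampling grid, the Gaussian noise, and all parameter values are hypothetical choices made for the example.

```python
import random


def observe(signal, num_samples, noise_std, rng):
    """Return one circularly shifted, down-sampled, noisy copy of `signal`."""
    m = len(signal)
    shift = rng.randrange(m)                    # unknown circular shift (assumed uniform)
    shifted = signal[shift:] + signal[:shift]
    step = m // num_samples                     # assumes num_samples divides m
    sampled = shifted[::step][:num_samples]     # keep L regularly spaced samples
    return [v + rng.gauss(0.0, noise_std) for v in sampled]


rng = random.Random(0)
M, L = 16, 4                                    # L equals sqrt(M), the regime in the abstract
signal = [rng.gauss(0.0, 1.0) for _ in range(M)]
observations = [observe(signal, L, noise_std=2.0, rng=rng) for _ in range(1000)]
```

Each observation exposes only `L` of the `M` signal entries, at an unknown offset; the estimation problem is to recover `signal` from the whole collection.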


2020 ◽  
Author(s):  
Marc Gottschling ◽  
Lucas Czech ◽  
Frédéric Mahé ◽  
Sina Adl ◽  
Micah Dunthorn

Abstract Dinophytes are widely distributed in marine and fresh waters, but have yet to be conclusively documented in terrestrial environments. Here we evaluated the presence of these protists from an environmental DNA metabarcoding dataset of Neotropical rainforest soils. Using a phylogenetic placement approach with a reference alignment and tree, we showed that the numerous sequencing reads that were assigned to the dinophytes did not associate with taxonomy, environmental preference, nutritional mode, or dormancy. All the dinophytes in the soils are most likely windblown dispersal units of aquatic species, and are not biologically active residents of terrestrial environments.


Author(s):  
Omar Abou Saada ◽  
Andreas Tsouris ◽  
Anne Friedrich ◽  
Joseph Schacherer

Abstract While genome sequencing and assembly are now routine, we still do not have a full and precise picture of polyploid genomes. Phasing these genomes, i.e. deducing haplotypes from genomic data, remains a challenge. Despite numerous attempts, no existing polyploid phasing method provides accurate and contiguous haplotype predictions. To address this need, we developed nPhase, a ploidy-agnostic pipeline and algorithm that leverages the accuracy of short reads and the length of long reads to solve reference alignment-based phasing for samples of unspecified ploidy (https://github.com/nPhasePipeline/nPhase). nPhase was validated on virtually constructed polyploid genomes of the model species Saccharomyces cerevisiae, generated by combining sequencing data of homozygous isolates. nPhase obtained on average >95% accuracy and a contiguous 1.25 haplotigs per haplotype to cover >90% of each chromosome (heterozygosity rate ≥0.5%). This new phasing method opens the door to explore polyploid genomes through applications such as population genomics and hybrid studies.


2020 ◽  
Vol 2 (1) ◽  
pp. 25-75
Author(s):  
Afonso Bandeira ◽  
Jonathan Niles-Weed ◽  
Philippe Rigollet

2020 ◽  
Vol 35 ◽  
Author(s):  
Daniel Faria ◽  
Alfio Ferrara ◽  
Ernesto Jiménez-ruiz ◽  
Stefano Montanelli ◽  
Catia Pesquita

Abstract The quality of a dataset used for evaluating data linking methods, techniques, and tools depends on the availability of a set of mappings, called a reference alignment, that is known to be correct. In particular, it is crucial that mappings effectively represent relations between pairs of entities that are indeed similar because they denote the same object. Since the reliability of mappings is decisive for a fair evaluation of automatic linking methods and tools, we call this property mapping fairness. In this article, we propose a crowd-based approach, called Crowd Quality (CQ), for assessing the quality of data linking datasets by measuring the fairness of the mappings in the reference alignment. Moreover, we present a real experiment in which we evaluate two state-of-the-art data linking tools before and after the refinement of the reference alignment based on the CQ approach, in order to show the benefits of the crowd assessment of mapping fairness.
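To make concrete why the correctness of a reference alignment matters, the sketch below scores a tool's mappings against a reference using precision, recall, and F-measure. These measures are an assumption about how such an evaluation is typically scored; the abstract does not state which metrics the authors use, and the function name and entity identifiers are hypothetical.

```python
def evaluate(found, reference):
    """Score a set of (source, target) mappings against a reference alignment."""
    found, reference = set(found), set(reference)
    correct = found & reference
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical entity identifiers, for illustration only.
reference = {("a1", "b1"), ("a2", "b2"), ("a3", "b3")}
found = {("a1", "b1"), ("a2", "b9")}
precision, recall, f_measure = evaluate(found, reference)
```

Any incorrect pair in `reference` would penalise a tool that rightly omits it and reward a tool that repeats the error, which is the kind of distortion a fairness assessment is meant to expose.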


Jurnal Pari ◽  
2019 ◽  
Vol 4 (1) ◽  
pp. 33
Author(s):  
Arief Gunawan

Book spine labeling is one of the library management activities that help librarians shelve the collection on the available bookshelves or cabinets, so that items can be found again easily when a patron looks for them; shelving itself is easier when the classification is used as the reference alignment. In the library of the Research Center of Fisheries (Pusat Riset Perikanan), book spine labels are created with the Slims Senayan application, which provides a label-printing menu. It is hoped that using book spine labels in an orderly manner will help librarians manage item placement and retrieve library collections effectively and efficiently.

