pcr duplicates
Recently Published Documents


Total documents: 35 (five years: 15)

H-index: 6 (five years: 1)

2022 ◽  
Vol 12 (1) ◽  
Author(s):  
Vera Belova ◽  
Anna Pavlova ◽  
Robert Afasizhev ◽  
Viktoriya Moskalenko ◽  
Margarita Korzhanova ◽  
...  

Abstract Human exome sequencing is a classical method used in most medical genetic applications. The leaders in the field are the manufacturers of enrichment kits based on hybridization of cRNA or cDNA biotinylated probes specific for a genomic region of interest. Recently, the platforms manufactured by the Chinese company MGI Tech have become widespread in Europe and Asia. The reliability and quality of the obtained data are already beyond any doubt. However, only a few kits compatible with these sequencers can be used for such specific tasks as exome sequencing. We developed our own solution for library pre-capture pooling and exome enrichment with Agilent probes. In this work, using a set of the standard benchmark samples from the Platinum Genome collection, we demonstrate that the qualitative and quantitative parameters of our protocol, which we called “RSMU_exome”, exceed those of the MGI Tech kit. Our protocol allows for identifying more SNVs and indels, generates fewer PCR duplicates, enables pooling of more samples in a single enrichment procedure, and requires less raw data to obtain results comparable with those of the MGI Tech protocol. The cost of our protocol is also lower than that of MGI Tech's solution.


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Asuka Hori ◽  
Hiroko Ogata-Kawata ◽  
Aiko Sasaki ◽  
Ken Takahashi ◽  
Kosuke Taniguchi ◽  
...  

Abstract Objective: We aimed to simplify our fetal RHD genotyping protocol by changing the method to attach Illumina’s sequencing adaptors to PCR products from the ligation-based method to a PCR-based method, and to improve its reliability and robustness by introducing unique molecular indexes (UMIs), which allow us to count the numbers of DNA fragments used as PCR templates and to minimize the effects of PCR and sequencing errors. Results: Both of the newly established protocols reduced time and cost compared with our conventional protocol. Removal of PCR duplicates using UMIs reduced the frequencies of erroneously mapped sequence reads likely generated by PCR and sequencing errors. The modified protocols will help us implement fetal RHD genotyping for East Asian populations in clinical practice.
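The two uses of UMIs named in this abstract, counting template molecules and damping PCR and sequencing errors, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the input layout (tuples of target, UMI, read sequence), the function names, the placeholder targets `target_A` and `target_B`, and the assumption that all reads in a UMI family have equal length are hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def collapse_by_umi(reads):
    """Group reads into UMI families and build one consensus per family.

    `reads` is an iterable of (target, umi, sequence) tuples (assumed layout).
    Reads sharing a target and a UMI are treated as PCR copies of one template.
    """
    families = defaultdict(list)
    for target, umi, seq in reads:
        families[(target, umi)].append(seq)

    consensus = {}
    for key, seqs in families.items():
        # Majority vote per position; assumes equal-length reads in a family.
        consensus[key] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


def templates_per_target(consensus):
    """Count distinct UMI families (estimated template molecules) per target."""
    return dict(Counter(target for target, _umi in consensus))


# Hypothetical example data: three PCR copies of one template (one copy
# carries an error), a second template, and one read from another target.
example_reads = [
    ("target_A", "AAC", "ACGT"),
    ("target_A", "AAC", "ACGT"),
    ("target_A", "AAC", "ACTT"),
    ("target_A", "GGT", "ACGT"),
    ("target_B", "AAC", "TTTT"),
]
example_consensus = collapse_by_umi(example_reads)
example_counts = templates_per_target(example_consensus)
```

In this toy input the erroneous copy `ACTT` is outvoted within its UMI family, and `target_A` is counted as two template molecules rather than four reads.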


Animals ◽  
2021 ◽  
Vol 11 (8) ◽  
pp. 2226
Author(s):  
Sazia Kunvar ◽  
Sylwia Czarnomska ◽  
Cino Pertoldi ◽  
Małgorzata Tokarska

The European bison is a non-model organism; thus, most of its genetic and genomic analyses have been performed using cattle-specific resources, such as BovineSNP50 BeadChip or Illumina Bovine 800 K HD Bead Chip. The problem with non-specific tools is the potential loss of evolutionarily diversified information (ascertainment bias) and species-specific markers. Here, we have used a genotyping-by-sequencing (GBS) approach for genotyping 256 samples from the European bison population in Bialowieza Forest (Poland) and performed an analysis using two integrated pipelines of the STACKS software: one is de novo (without reference genome) and the other is a reference pipeline (with reference genome). Moreover, we used a reference pipeline with two different genomes, i.e., Bos taurus and European bison. GBS is a useful tool for SNP genotyping in non-model organisms due to its cost effectiveness. Our results support GBS with a reference pipeline without PCR duplicates as a powerful approach for studying the population structure and genotyping data of non-model organisms. We found more polymorphic markers in the reference pipeline in comparison to the de novo pipeline. The decreased number of SNPs from the de novo pipeline could be due to the extremely low level of heterozygosity in European bison. It has been confirmed that all the SNPs obtained from the de novo/Bos taurus and Bos taurus reference pipelines were unique and not included in 800 K BovineHD BeadChip.


2021 ◽  
Vol 3 (3) ◽  
Author(s):  
Arun H Patil ◽  
Marc K Halushka

Abstract MicroRNAs and tRFs are classes of small non-coding RNAs, known for their roles in translational regulation of genes. Advances in next-generation sequencing (NGS) have enabled high-throughput small RNA-seq studies, which require robust alignment pipelines. Our laboratory previously developed miRge and miRge2.0 as flexible tools to process sequencing data for annotation of miRNAs and other small-RNA species and further predict novel miRNAs using a support vector machine approach. Although miRge2.0 is a leading analysis tool in terms of speed with unique quantifying and annotation features, it has a few limitations. We present miRge3.0, which provides additional features along with compatibility with newer versions of Cutadapt and Python. The revisions of the tool include the ability to process Unique Molecular Identifiers (UMIs) to account for PCR duplicates while quantifying miRNAs in the datasets, correct erroneous single base substitutions in miRNAs with miREC and an accurate mirGFF3 formatted isomiR tool. miRge3.0 also has speed improvements benchmarked to miRge2.0, Chimira and sRNAbench. Finally, miRge3.0 output integrates into other packages for a streamlined analysis process and provides a cross-platform Graphical User Interface (GUI). In conclusion, miRge3.0 is our third generation small RNA-seq aligner with improvements in speed, versatility and functionality over earlier iterations.


2021 ◽  
Vol 6 ◽  
pp. 141
Author(s):  
Oscar G Wilkins ◽  
Charlotte Capitanchik ◽  
Nicholas M. Luscombe ◽  
Jernej Ule

Background: The first step of virtually all next generation sequencing analysis involves the splitting of the raw sequencing data into separate files using sample-specific barcodes, a process known as “demultiplexing”. However, we found that existing software for this purpose was either too inflexible or too computationally intensive for fast, streamlined processing of raw, single-end FASTQ files containing combinatorial barcodes. Results: Here, we introduce a fast and uniquely flexible demultiplexer, named Ultraplex, which splits a raw FASTQ file containing barcodes either at a single end or at both 5’ and 3’ ends of reads, trims the sequencing adaptors and low-quality bases, and moves unique molecular identifiers (UMIs) into the read header, allowing subsequent removal of PCR duplicates. Ultraplex is able to perform such single or combinatorial demultiplexing on both single- and paired-end sequencing data, and can process an entire Illumina HiSeq lane, consisting of nearly 500 million reads, in less than 20 minutes. Conclusions: Ultraplex greatly reduces computational burden and pipeline complexity for the demultiplexing of complex sequencing libraries, such as those produced by various CLIP and ribosome profiling protocols, and is also very user friendly, enabling streamlined, robust data processing. Ultraplex is available on PyPI and Conda and via GitHub.
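The step of moving a UMI into the read header can be pictured with a short Python sketch. This is not Ultraplex code and does not reproduce its header format: the function name, the assumption that the UMI occupies the first `umi_len` bases of the read, and the `_UMI` suffix on the read name are all hypothetical.

```python
def move_umi_to_header(record, umi_len):
    """Move a 5' UMI from the read sequence into the FASTQ header.

    `record` is a (header, sequence, plus, quality) tuple for one FASTQ entry.
    Assumes the UMI is the first `umi_len` bases of the read; the "_UMI"
    suffix on the read name is an illustrative convention only.
    """
    header, seq, plus, qual = record
    umi = seq[:umi_len]

    # Keep any comment after the first space, append the UMI to the name only.
    name, _, comment = header.partition(" ")
    new_header = f"{name}_{umi}" + (f" {comment}" if comment else "")

    # Trim the UMI bases and their quality scores from the read itself.
    return (new_header, seq[umi_len:], plus, qual[umi_len:])


# Hypothetical FASTQ entry with a 4-base UMI at the start of the read.
example_record = ("@read1 1:N:0", "ACGTTTGGCC", "+", "IIIIIHHHHH")
example_out = move_umi_to_header(example_record, 4)
```

With the UMI kept in the header, the trimmed read can be aligned without the UMI bases, and a later deduplication step can still tell PCR copies apart from independent molecules.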


Author(s):  
Alec Barrett ◽  
Rebecca McWhirter ◽  
Seth R Taylor ◽  
Alexis Weinreb ◽  
David M Miller ◽  
...  

Abstract A recent and powerful technique is to obtain transcriptomes from rare cell populations, such as single neurons in C. elegans, by enriching dissociated cells using fluorescent sorting. However, these cell samples often have low yields of RNA that present challenges in library preparation. This can lead to PCR duplicates, noisy gene expression for lowly expressed genes, and other issues that limit endpoint analysis. Further, some common resources, such as sequence specific kits for removing ribosomal RNA, are not optimized for non-mammalian samples. To advance library construction for such challenging samples, we compared two approaches for building RNAseq libraries from less than 10 nanograms of C. elegans RNA: SMARTSeq V4 (Takara), a widely used kit for selecting poly-adenylated transcripts; and SoLo Ovation (Tecan Genomics), a newly developed ribodepletion-based approach. For ribodepletion, we used a custom kit of 200 probes designed to match C. elegans rRNA gene sequences. We found that SoLo Ovation, in combination with our custom C. elegans probe set for rRNA depletion, detects an expanded set of noncoding RNAs, shows reduced noise in lowly expressed genes, and more accurately counts expression of long genes. The approach described here should be broadly useful for similar efforts to analyze transcriptomics when RNA is limiting.


2021 ◽  
Author(s):  
Vera Belova ◽  
Anna Pavlova ◽  
Robert Afasizhev ◽  
Viktoria Moskalenko ◽  
Margarita Korzhanova ◽  
...  

Abstract Human whole exome sequencing (WES) is now the standard for most medical genetics applications worldwide. The leaders are manufacturers of enrichment kits that base their protocols on a hybridization approach using cRNA or cDNA biotinylated samples specific to regions of interest in the genome. Recently, platforms from the Chinese company MGI Tech have been successfully promoted in the markets of many countries in Europe and Asia. There is no longer any question about their reliability and the quality of the data obtained. However, very few task-specific kits, for WES in particular, are available for these sequencers. We have developed our own solution for library pre-capture pooling and exome enrichment using Agilent probes. In this work, we demonstrate on a set of standard benchmark samples from the Platinum Genome Collection that our protocol, called “RSMU_exome”, is superior to the kit from MGI Tech in qualitative and quantitative terms. It allows detecting more SNVs and CNVs with superior sensitivity and specificity values, generates fewer PCR duplicates, allows more samples to be pooled in a single enrichment, and requires less raw data to produce results comparable to the MGI Tech solution. Also, our protocol is significantly cheaper than the kit from the Chinese manufacturer.


2021 ◽  
Author(s):  
Arun H. Patil ◽  
Marc K. Halushka

ABSTRACT MicroRNAs and tRFs are classes of small non-coding RNAs, known for their roles in translational regulation of genes. Advances in next-generation sequencing (NGS) have enabled high-throughput small RNA-seq studies, which require robust alignment pipelines. Our laboratory previously developed miRge and miRge2.0 as flexible tools to process sequencing data for annotation of miRNAs and other small-RNA species and further predict novel miRNAs using a support vector machine approach. Although miRge2.0 is a leading analysis tool in terms of speed with unique quantifying and annotation features, it has a few limitations. We present miRge3.0, which provides additional features along with compatibility with newer versions of Cutadapt and Python. The revisions of the tool include the ability to process Unique Molecular Identifiers (UMIs) to account for PCR duplicates while quantifying miRNAs in the datasets and an accurate GFF3 formatted isomiR tool. miRge3.0 also has speed improvements benchmarked to miRge2.0, Chimira and sRNAbench. Finally, miRge3.0 output integrates into other packages for a streamlined analysis process and provides a cross-platform Graphical User Interface (GUI). In conclusion, miRge3.0 is our 3rd generation small RNA-seq aligner with improvements in speed, versatility, and functionality over earlier iterations.


2021 ◽  
Author(s):  
Alec Barrett ◽  
Rebecca McWhirter ◽  
Seth R Taylor ◽  
Alexis Weinreb ◽  
David M Miller ◽  
...  

ABSTRACT A recent and powerful technique is to obtain transcriptomes from rare cell populations, such as single neurons in C. elegans, by enriching dissociated cells using fluorescent sorting. However, these cell samples often have low yields of RNA that present challenges in library preparation. This can lead to PCR duplicates, noisy gene expression for lowly expressed genes, and other issues that limit endpoint analysis. Further, some common resources, such as sequence specific kits for removing ribosomal RNA, are not optimized for non-mammalian samples. To optimize library construction for such challenging samples, we compared two approaches for building RNAseq libraries from less than 10 nanograms of C. elegans RNA: SMARTSeq V4 (Takara), a widely used kit for selecting poly-adenylated transcripts; and SoLo Ovation (Tecan Genomics), a newly developed ribodepletion-based approach. For ribodepletion, we used a custom kit of 200 probes designed to match C. elegans rRNA gene sequences. We found that SoLo Ovation, in combination with our custom C. elegans probe set for rRNA depletion, detects an expanded set of noncoding RNAs, shows reduced noise in lowly expressed genes, and more accurately counts expression of long genes. The approach described here should be broadly useful for similar efforts to analyze transcriptomics when RNA is limiting.


2020 ◽  
Vol 3 (1) ◽  
Author(s):  
Tao Zhu ◽  
Keyan Liao ◽  
Rongfang Zhou ◽  
Chunjiao Xia ◽  
Weibo Xie

Abstract ATAC-seq (Assay for Transposase-Accessible Chromatin with high-throughput sequencing) provides an efficient way to analyze nucleosome-free regions and has been applied widely to identify transcription factor footprints. Both applications rely on the accurate quantification of insertion events of the hyperactive transposase Tn5. However, because of PCR amplification, it is impossible to accurately distinguish independently generated identical Tn5 insertion events from PCR duplicates using the standard ATAC-seq technique. Removing PCR duplicates based on mapping coordinates introduces increasing bias towards highly accessible chromatin regions. To overcome this limitation, we establish a UMI-ATAC-seq technique by incorporating unique molecular identifiers (UMIs) into standard ATAC-seq procedures. UMI-ATAC-seq can rescue about 20% of reads that are mistaken as PCR duplicates in standard ATAC-seq in our study. We demonstrate that UMI-ATAC-seq could more accurately quantify chromatin accessibility and significantly improve the sensitivity of identifying transcription factor footprints. An analytic pipeline is developed to facilitate the application of UMI-ATAC-seq, and it is available at https://github.com/tzhu-bio/UMI-ATAC-seq.
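The difference between coordinate-based and UMI-aware duplicate removal described in this abstract can be shown with a small Python sketch. It is not the UMI-ATAC-seq pipeline from the linked repository: the fragment layout (chromosome, start, end, UMI), the function names, and the toy data are assumptions made for illustration.

```python
def dedup_by_coordinates(fragments):
    """Keep one fragment per (chrom, start, end); UMIs are ignored."""
    return {(chrom, start, end) for chrom, start, end, _umi in fragments}


def dedup_by_coordinates_and_umi(fragments):
    """Keep one fragment per (chrom, start, end, umi)."""
    return set(fragments)


def rescued_fragments(fragments):
    """Count fragments that coordinate-only deduplication would discard
    but that carry a distinct UMI, i.e. likely independent insertions."""
    by_coord = dedup_by_coordinates(fragments)
    by_umi = dedup_by_coordinates_and_umi(fragments)
    return len(by_umi) - len(by_coord)


# Hypothetical fragments as (chrom, start, end, umi) tuples. The first two
# are true PCR duplicates; the third shares coordinates but has another UMI.
example_fragments = [
    ("chr1", 100, 250, "AAT"),
    ("chr1", 100, 250, "AAT"),
    ("chr1", 100, 250, "CGG"),
    ("chr1", 300, 420, "TTA"),
]
example_rescued = rescued_fragments(example_fragments)
```

In the toy data, coordinate-only deduplication keeps two fragments while the UMI-aware version keeps three, so one fragment is rescued. Identical coordinates with different UMIs are more likely in highly accessible regions, which is where the abstract locates the bias.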

