structural variant
Recently Published Documents



Author(s):  
Jianbo Zhang ◽  
Dilip R Panthee

Abstract Genomic regions that control traits of interest can be rapidly identified using BSA-Seq, a technology in which next-generation sequencing (NGS) is applied to bulked segregant analysis (BSA). We recently developed the significant structural variant method for BSA-Seq data analysis that exhibits higher detection power than standard BSA-Seq analysis methods. Our original algorithm was developed to analyze BSA-Seq data in which genome sequences of one parent served as the reference sequences in genotype calling, and thus required the availability of high-quality assembled parental genome sequences. Here we modified the original script to effectively detect the genomic region-trait associations using only bulk genome sequences. We analyzed two public BSA-Seq datasets using our modified method and the standard allele frequency and G-statistic methods with and without the aid of the parental genome sequences. Our results demonstrate that the genomic region(s) associated with the trait of interest could be reliably identified via the significant structural variant method without using the parental genome sequences.
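The two standard per-SNP statistics named in this abstract can be illustrated with a short sketch. It is not the authors' script: it assumes that each SNP is summarized by reference and alternate allele read counts in the two bulks, and it applies the textbook G-test to the resulting 2x2 table. The function names and the example counts are hypothetical.

```python
import math


def delta_allele_frequency(ref1, alt1, ref2, alt2):
    """Difference in alternate-allele frequency between bulk 1 and bulk 2."""
    return alt1 / (ref1 + alt1) - alt2 / (ref2 + alt2)


def g_statistic(ref1, alt1, ref2, alt2):
    """Textbook G-test statistic for a 2x2 table of allele counts.

    G = 2 * sum(O * ln(O / E)), where E is the count expected if allele
    frequency were independent of bulk. Cells with O == 0 contribute 0.
    """
    observed = [[ref1, alt1], [ref2, alt2]]
    total = ref1 + alt1 + ref2 + alt2
    if total == 0:
        return 0.0
    row_sums = [ref1 + alt1, ref2 + alt2]
    col_sums = [ref1 + ref2, alt1 + alt2]
    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:
                expected = row_sums[i] * col_sums[j] / total
                g += o * math.log(o / expected)
    return 2.0 * g


# Hypothetical counts: a SNP with identical bulks and a fully divergent SNP.
unlinked = g_statistic(10, 10, 10, 10)
linked = g_statistic(20, 0, 0, 20)
```

A SNP unlinked to the trait should give a G value near zero and a frequency difference near zero, while a SNP close to the causal locus should give large values for both.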


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Zhikun Wu ◽  
Zehang Jiang ◽  
Tong Li ◽  
Chuanbo Xie ◽  
Liansheng Zhao ◽  
...  

Abstract A complete characterization of genetic variation is a fundamental goal of human genome research. Long-read sequencing has improved the sensitivity of structural variant discovery. Here, we conduct a long-read sequencing-based structural variant analysis of 405 unrelated Chinese individuals, with 68 phenotypic and clinical measurements. We discover a landscape of 132,312 nonredundant structural variants, of which 45.2% are novel. The identified structural variants are of high quality, with an estimated false discovery rate of 3.2%. The concatenated length of all the structural variants is approximately 13.2% of the human reference genome. We annotate 1,929 loss-of-function structural variants affecting the coding sequence of 1,681 genes. We discover rare deletions in HBA1/HBA2/HBB associated with anemia. Furthermore, we identify structural variants related to immunity which differentiate the northern and southern Chinese populations. Our study describes the landscape of structural variants in the Chinese population and their contribution to phenotypes and disease.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Qiuping Tan ◽  
Sen Li ◽  
Yuzheng Zhang ◽  
Min Chen ◽  
Binbin Wen ◽  
...  

Abstract Prunus species include many important perennial fruit crops, such as peach, plum, apricot, and related wild species. Here, we report de novo genome assemblies for five species, including the cultivated species peach (Prunus persica), plum (Prunus salicina), and apricot (Prunus armeniaca), and the wild peach species Tibetan peach (Prunus mira) and Chinese wild peach (Prunus davidiana). The genomes ranged from 240 to 276 Mb in size, with contig N50 values of 2.27−8.30 Mb and 25,333−27,826 protein-coding gene models. As the phylogenetic tree shows, plum diverged from its common ancestor with peach, wild peach species, and apricot ~7 million years ago (MYA). We analyzed whole-genome resequencing data of 417 peach accessions, called 3,749,618 high-quality SNPs, 577,154 small indels, 31,800 deletions, duplications, and inversions, and 32,338 insertions, and performed a structural variant-based genome-wide association study (GWAS) of key agricultural traits. From our GWAS data, we identified a locus associated with a fruit shape corresponding to the OVATE transcription factor, where a large inversion event correlates with higher OVATE expression in flat-shaped accessions. Furthermore, a GWAS revealed a NAC transcription factor associated with fruit developmental timing that is linked to a tandem repeat variant and elevated NAC expression in early-ripening accessions. We also identified a locus encoding microRNA172d, where insertion of a transposable element into its promoter was found in double-flower accessions. Thus, our efforts have suggested roles for OVATE, a NAC transcription factor, and microRNA172d in fruit shape, fruit development period, and floral morphology, respectively, that can be connected to traits in other crops, thereby demonstrating the importance of parallel evolution in the diversification of several commercially important domesticated species.
In general, these genomic resources will facilitate functional genomics, evolutionary research, and agronomic improvement of these five and other Prunus species. We believe that structural variant-based GWASs can also be used in other plants, animal species, and humans and be combined with deep sequencing GWASs to precisely identify candidate genes and genetic architecture components.
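The contig N50 values quoted in this abstract follow the usual definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. A minimal sketch of that calculation, with a hypothetical function name and made-up contig lengths:

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of the
    assembly size.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Made-up contig lengths (arbitrary units), not taken from the assemblies.
example = n50([2, 2, 2, 3, 3, 4, 8, 8])
```

In the example the lengths sum to 32, and the two contigs of length 8 already cover half of that, so the N50 is 8.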


GigaScience ◽  
2021 ◽  
Vol 10 (9) ◽  
Author(s):  
Lanying Wei ◽  
Martin Dugas ◽  
Sarah Sandmann

Abstract Background: Artifact chimeric reads are enriched in next-generation sequencing data generated from formalin-fixed paraffin-embedded (FFPE) samples. Previous work indicated that these reads are characterized by erroneous split-read support that is interpreted as evidence of structural variants. Thus, a large number of false-positive structural variants are detected. To our knowledge, no tool is currently available to specifically call or filter structural variants in FFPE samples. To overcome this gap, we developed 2 R packages: SimFFPE and FilterFFPE. Results: SimFFPE is a read simulator, specifically designed for next-generation sequencing data from FFPE samples. A mixture of characteristic artifact chimeric reads, as well as normal reads, is generated. FilterFFPE is a filtration algorithm, removing artifact chimeric reads from sequencing data while keeping real chimeric reads. To evaluate the performance of FilterFFPE, we performed structural variant calling with 3 common tools (Delly, Lumpy, and Manta) with and without prior filtration with FilterFFPE. After applying FilterFFPE, the mean positive predictive value improved from 0.27 to 0.48 in simulated samples and from 0.11 to 0.27 in real samples, while sensitivity remained basically unchanged or even slightly increased. Conclusions: FilterFFPE improves the performance of SV calling in FFPE samples. It was validated by analysis of simulated and real data.
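For readers unfamiliar with the two metrics reported here, the sketch below shows how positive predictive value and sensitivity are conventionally computed from counts of true-positive, false-positive, and false-negative calls. The function names and the counts are invented for illustration; they are not the evaluation code or the data behind the reported figures.

```python
def positive_predictive_value(true_pos, false_pos):
    """Fraction of reported calls that are real: TP / (TP + FP)."""
    called = true_pos + false_pos
    if called == 0:
        raise ValueError("no calls were made")
    return true_pos / called


def sensitivity(true_pos, false_neg):
    """Fraction of real events that were reported: TP / (TP + FN)."""
    real = true_pos + false_neg
    if real == 0:
        raise ValueError("no real events in the truth set")
    return true_pos / real


# Invented counts: filtering removes many false positives but no true positives.
before = positive_predictive_value(true_pos=27, false_pos=73)
after = positive_predictive_value(true_pos=27, false_pos=29)
recall = sensitivity(true_pos=27, false_neg=3)
```

With these counts the positive predictive value rises because false positives drop while true positives stay fixed, which is also why sensitivity is unchanged by the filtering step.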


GigaScience ◽  
2021 ◽  
Vol 10 (9) ◽  
Author(s):  
Yilei Fu ◽  
Medhat Mahmoud ◽  
Viginesh Vaibhav Muraliraman ◽  
Fritz J Sedlazeck ◽  
Todd J Treangen

Abstract Background: Long-read sequencing has enabled unprecedented surveys of structural variation across the entire human genome. To maximize the potential of long-read sequencing in this context, novel mapping methods have emerged that have primarily focused on either speed or accuracy. Various heuristics and scoring schemas have been implemented in widely used read mappers (minimap2 and NGMLR) to optimize for speed or accuracy, which have variable performance across different genomic regions and for specific structural variants. Our hypothesis is that constraining read mapping to the use of a single gap penalty across distinct mutational hot spots reduces read alignment accuracy and impedes structural variant detection. Findings: We tested our hypothesis by implementing a read-mapping pipeline called Vulcan that uses two distinct gap penalty modes, which we refer to as dual-mode alignment. The high-level idea is that Vulcan uses the normalized edit distance of reads mapped with minimap2 to identify poorly aligned reads and realigns them with the more accurate yet computationally more expensive long-read mapper NGMLR. In support of our hypothesis, we show that Vulcan improves the alignments of Oxford Nanopore Technologies long reads for both simulated and real datasets. These improvements, in turn, lead to more accurate structural variant calling on human genome datasets compared to either of the read-mapping methods alone. Conclusions: Vulcan is the first long-read mapping framework that combines two distinct gap penalty modes for improved structural variant recall and precision. Vulcan is open-source and available under the MIT License at https://gitlab.com/treangenlab/vulcan.
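The read-triage idea described in this abstract can be sketched as follows. This is not Vulcan's code: it assumes that each first-pass alignment is summarized by an edit distance and an aligned length, and it uses a hypothetical rule that sends a fixed fraction of the worst-scoring reads to the second mapper. The class, the function names, and the `fraction` parameter are all illustrative.

```python
import math
from typing import NamedTuple


class Alignment(NamedTuple):
    """Summary of one first-pass alignment (hypothetical structure)."""
    read_name: str
    edit_distance: int
    aligned_length: int


def normalized_edit_distance(aln):
    """Edit distance divided by aligned length; 1.0 if nothing aligned."""
    if aln.aligned_length <= 0:
        return 1.0
    return aln.edit_distance / aln.aligned_length


def select_for_realignment(alignments, fraction=0.1):
    """Names of the worst-aligned reads, worst first.

    `fraction` is an assumed cutoff: the share of reads, ranked by
    normalized edit distance, that would be passed to the slower mapper.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    ranked = sorted(alignments, key=normalized_edit_distance, reverse=True)
    keep = math.ceil(len(ranked) * fraction)
    return [aln.read_name for aln in ranked[:keep]]


# Made-up first-pass results for four reads.
first_pass = [
    Alignment("read_a", 5, 100),
    Alignment("read_b", 30, 100),
    Alignment("read_c", 20, 200),
    Alignment("read_d", 1, 100),
]
to_realign = select_for_realignment(first_pass, fraction=0.5)
```

Normalizing by aligned length keeps long reads from being flagged simply because they accumulate more absolute errors than short ones.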


2021 ◽  
Author(s):  
Vera B. Kaiser ◽  
Lana Talmane ◽  
Yatendra Kumar ◽  
Fiona Semple ◽  
Marie MacLennan ◽  
...  

Mutation in the germline is the ultimate source of genetic variation, but little is known about the influence of germline chromatin structure on mutational processes. Using ATAC-seq, we profile the open chromatin landscape of human spermatogonia, the most proliferative cell type of the germline, identifying transcription factor binding sites (TFBSs) and PRDM9 binding sites, a subset of which will initiate meiotic recombination. We observe an increase in rare structural variant (SV) breakpoints at PRDM9-bound sites, implicating meiotic recombination in the generation of structural variation. Many germline TFBSs, such as NRF1, are also associated with increased rates of SV breakpoints, apparently independent of recombination. Singleton short insertions (≥5 bp) are highly enriched at TFBSs, particularly at sites bound by testis active TFs, and their rates correlate with those of structural variant breakpoints. Short insertions often duplicate the TFBS motif, leading to clustering of motif sites near regulatory regions in this male-driven evolutionary process. Increased mutation loads at germline TFBSs disproportionately affect neural enhancers with activity in spermatogonia, potentially altering neurodevelopmental regulatory architecture. Local chromatin structure in spermatogonia is thus pervasive in shaping both evolution and disease.


2021 ◽  
Author(s):  
Ruining Dong ◽  
Daniel L Cameron ◽  
Justin Bedo ◽  
Anthony T Papenfuss

Background: The biological significance of structural variation is now more widely recognized. However, due to the lack of available tools for downstream analysis, including processing and annotating, interpretation of structural variant calls remains a challenge. Findings: Here we present svaRetro and svaNUMT, R packages that provide functions for annotating novel genomic events such as non-reference retro-copied transcripts and nuclear integration of mitochondrial DNA. We evaluate the performance of these packages to detect events using simulations and public benchmarking datasets, and annotate processed transcripts in a public structural variant database. Conclusions: svaRetro and svaNUMT provide efficient, modular tools for downstream identification and annotation of structural variant calls.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Daniel L. Cameron ◽  
Jonathan Baber ◽  
Charles Shale ◽  
Jose Espejo Valle-Inclan ◽  
Nicolle Besselink ◽  
...  

Abstract GRIDSS2 is the first structural variant caller to explicitly report single breakends (breakpoints in which only one side can be unambiguously determined). By treating single breakends as a fundamental genomic rearrangement signal on par with breakpoints, GRIDSS2 can explain 47% of somatic centromere copy number changes using single breakends to non-centromere sequence. On a cohort of 3782 deeply sequenced metastatic cancers, GRIDSS2 achieves an unprecedented 3.1% false negative rate and 3.3% false discovery rate and identifies a novel 32–100 bp duplication signature. GRIDSS2 simplifies complex rearrangement interpretation through phasing of structural variants with 16% of somatic calls phasable using paired-end sequencing.

