Genomic structural variants constrain and facilitate adaptation in natural populations of Theobroma cacao, the chocolate tree

2021 ◽  
Vol 118 (35) ◽  
pp. e2102914118 ◽  
Author(s):  
Tuomas Hämälä ◽  
Eric K. Wafula ◽  
Mark J. Guiltinan ◽  
Paula E. Ralph ◽  
Claude W. dePamphilis ◽  
...  

Genomic structural variants (SVs) can play important roles in adaptation and speciation. Yet the overall fitness effects of SVs are poorly understood, partly because accurate population-level identification of SVs requires multiple high-quality genome assemblies. Here, we use 31 chromosome-scale, haplotype-resolved genome assemblies of Theobroma cacao—an outcrossing, long-lived tree species that is the source of chocolate—to investigate the fitness consequences of SVs in natural populations. Among the 31 accessions, we find over 160,000 SVs, which together cover eight times more of the genome than single-nucleotide polymorphisms and short indels (125 versus 15 Mb). Our results indicate that a vast majority of these SVs are deleterious: they segregate at low frequencies and are depleted from functional regions of the genome. We show that SVs influence gene expression, which likely impairs gene function and contributes to the detrimental effects of SVs. We also provide empirical support for a theoretical prediction that SVs, particularly inversions, increase genetic load through the accumulation of deleterious nucleotide variants as a result of suppressed recombination. Despite the overall detrimental effects, we identify individual SVs bearing signatures of local adaptation, several of which are associated with genes differentially expressed between populations. Genes involved in pathogen resistance are strongly enriched among these candidates, highlighting the contribution of SVs to this important local adaptation trait. Beyond revealing empirical evidence for the evolutionary importance of SVs, these 31 de novo assemblies provide a valuable resource for genetic and breeding studies in T. cacao.

2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Michael Alonge ◽  
Sebastian Soyk ◽  
Srividya Ramakrishnan ◽  
Xingang Wang ◽  
Sara Goodwin ◽  
...  

Abstract We present RaGOO, a reference-guided contig ordering and orienting tool that leverages the speed and sensitivity of Minimap2 to accurately achieve chromosome-scale assemblies in minutes. After the pseudomolecules are constructed, RaGOO identifies structural variants, including those spanning sequencing gaps. We show that RaGOO accurately orders and orients 3 de novo tomato genome assemblies, including the widely used M82 reference cultivar. We then demonstrate the scalability and utility of RaGOO with a pan-genome analysis of 103 Arabidopsis thaliana accessions by examining the structural variants detected in the newly assembled pseudomolecules. RaGOO is available open source at https://github.com/malonge/RaGOO.


2016 ◽  
Author(s):  
Zoe June Assaf ◽  
Susanne Tilk ◽  
Jane Park ◽  
Mark L. Siegal ◽  
Dmitri A. Petrov

Abstract Mutations provide the raw material of evolution, and thus our ability to study evolution depends fundamentally on whether we have precise measurements of mutational rates and patterns. Here we explore the rates and patterns of mutations using i) de novo mutations from Drosophila melanogaster mutation accumulation lines and ii) polymorphisms segregating at extremely low frequencies. The first, mutation accumulation (MA) lines, are the product of maintaining flies in tiny populations for many generations, therefore rendering natural selection ineffective and allowing new mutations to accrue in the genome. In addition to generating a novel dataset of sequenced MA lines, we perform a meta-analysis of all published MA studies in D. melanogaster, which allows more precise estimates of mutational patterns across the genome. In the second half of this work, we identify polymorphisms segregating at extremely low frequencies using several publicly available population genomic data sets from natural populations of D. melanogaster. Extremely rare polymorphisms are difficult to detect with high confidence due to the problem of distinguishing them from sequencing error; however, a dataset of true rare polymorphisms would allow the quantification of mutational patterns. This is due to the fact that rare polymorphisms, much like de novo mutations, are on average younger and also relatively unaffected by the filter of natural selection. We identify a high quality set of ~70,000 rare polymorphisms, fully validated with resequencing, and use this dataset to measure mutational patterns in the genome. This includes identifying a high rate of multi-nucleotide mutation events at both short (~5bp) and long (~1kb) genomic distances, showing that mutation drives GC content lower in already GC-poor regions, and finding that the context-dependency of the mutation spectrum predicts long-term evolutionary patterns at four-fold synonymous sites. We also show that de novo mutations from independent mutation accumulation experiments display similar patterns of single nucleotide mutation, and match well the patterns of mutation found in natural populations.


Author(s):  
Kamil S. Jaron ◽  
Darren J. Parker ◽  
Yoann Anselmetti ◽  
Patrick Tran Van ◽  
Jens Bast ◽  
...  

Abstract The shift from sexual reproduction to parthenogenesis has occurred repeatedly in animals, but how the loss of sex affects genome evolution remains poorly understood. We generated de novo reference genomes for five independently evolved parthenogenetic species in the stick insect genus Timema and their closest sexual relatives. Using these references in combination with population genomic data, we show that parthenogenesis results in an extreme reduction of heterozygosity, and often leads to genetically uniform populations. We also find evidence for less effective positive selection in parthenogenetic species, supporting the view that sex is ubiquitous in natural populations because it facilitates fast rates of adaptation. Contrary to studies of non-recombining genome portions in sexual species, genomes of parthenogenetic species do not accumulate transposable elements (TEs), likely because successful parthenogens derive from sexual ancestors with inactive TEs. Because we are able to conduct replicated comparisons across five species pairs, our study reveals, for the first time, how animal genomes evolve in the absence of sex in natural populations, providing empirical support for the negative consequences of parthenogenetic reproduction as predicted by theory.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Sebastian Niehus ◽  
Hákon Jónsson ◽  
Janina Schönberger ◽  
Eythór Björnsson ◽  
Doruk Beyter ◽  
...  

Abstract Thousands of genomic structural variants (SVs) segregate in the human population and can impact phenotypic traits and diseases. Their identification in whole-genome sequence data of large cohorts is a major computational challenge. Most current approaches identify SVs in single genomes and afterwards merge the identified variants into a joint call set across many genomes. We describe the approach PopDel, which directly identifies deletions of about 500 to at least 10,000 bp in length in data of many genomes jointly, eliminating the need for subsequent variant merging. PopDel scales to tens of thousands of genomes as we demonstrate in evaluations on up to 49,962 genomes. We show that PopDel reliably reports common, rare and de novo deletions. On genomes with available high-confidence reference call sets PopDel shows excellent recall and precision. Genotype inheritance patterns in up to 6794 trios indicate that genotypes predicted by PopDel are more reliable than those of previous SV callers. Furthermore, PopDel’s running time is competitive with the fastest tested previous tools. The demonstrated scalability and accuracy of PopDel enables routine scans for deletions in large-scale sequencing studies.


Science ◽  
2018 ◽  
Vol 360 (6393) ◽  
pp. eaar6343 ◽  
Author(s):  
Zev N. Kronenberg ◽  
Ian T. Fiddes ◽  
David Gordon ◽  
Shwetha Murali ◽  
Stuart Cantsilieris ◽  
...  

Genetic studies of human evolution require high-quality contiguous ape genome assemblies that are not guided by the human reference. We coupled long-read sequence assembly and full-length complementary DNA sequencing with a multiplatform scaffolding approach to produce ab initio chimpanzee and orangutan genome assemblies. By comparing these with two long-read de novo human genome assemblies and a gorilla genome assembly, we characterized lineage-specific and shared great ape genetic variation ranging from single– to mega–base pair–sized variants. We identified ~17,000 fixed human-specific structural variants identifying genic and putative regulatory changes that have emerged in humans since divergence from nonhuman apes. Interestingly, these variants are enriched near genes that are down-regulated in human compared to chimpanzee cerebral organoids, particularly in cells analogous to radial glial neural progenitors.


2019 ◽  
Author(s):  
Sebastian Niehus ◽  
Hákon Jónsson ◽  
Janina Schönberger ◽  
Eythór Björnsson ◽  
Doruk Beyter ◽  
...  

Abstract Thousands of genomic structural variants segregate in the human population and can impact phenotypic traits and diseases. Their identification in whole-genome sequence data of large cohorts is a major computational challenge. We describe a novel approach, PopDel, which jointly identifies deletions of about 500 to at least 10,000 bp in length in many genomes together. PopDel scales to tens of thousands of genomes as we demonstrate in evaluations on up to 49,962 genomes. We show that PopDel reliably reports common, rare and de novo deletions. On genomes with available high-confidence reference call sets PopDel shows excellent recall and precision. Genotype inheritance patterns in up to 6,794 trios indicate that genotypes predicted by PopDel are more reliable than those of previous SV callers. Furthermore, PopDel’s running time is competitive with the fastest tested previous tools. The demonstrated scalability and accuracy of PopDel enables routine scans for deletions in large-scale sequencing studies.


2019 ◽  
Vol 11 (1) ◽  
Author(s):  
Sjors Middelkamp ◽  
Judith M. Vlaar ◽  
Jacques Giltay ◽  
Jerome Korzelius ◽  
Nicolle Besselink ◽  
...  

Abstract Background Genomic structural variants (SVs) can affect many genes and regulatory elements. Therefore, the molecular mechanisms driving the phenotypes of patients carrying de novo SVs are frequently unknown. Methods We applied a combination of systematic experimental and bioinformatic methods to improve the molecular diagnosis of 39 patients with multiple congenital abnormalities and/or intellectual disability harboring apparent de novo SVs, most with an inconclusive diagnosis after regular genetic testing. Results In 7 of these cases (18%), whole-genome sequencing analysis revealed disease-relevant complexities of the SVs missed in routine microarray-based analyses. We developed a computational tool to predict the effects on genes directly affected by SVs and on genes indirectly affected likely due to the changes in chromatin organization and impact on regulatory mechanisms. By combining these functional predictions with extensive phenotype information, candidate driver genes were identified in 16/39 (41%) patients. In 8 cases, evidence was found for the involvement of multiple candidate drivers contributing to different parts of the phenotypes. Subsequently, we applied this computational method to two cohorts containing a total of 379 patients with previously detected and classified de novo SVs and identified candidate driver genes in 189 cases (50%), including 40 cases whose SVs were previously not classified as pathogenic. Pathogenic position effects were predicted in 28% of all studied cases with balanced SVs and in 11% of the cases with copy number variants. Conclusions These results demonstrate an integrated computational and experimental approach to predict driver genes based on analyses of WGS data with phenotype association and chromatin organization datasets. These analyses nominate new pathogenic loci and have strong potential to improve the molecular diagnosis of patients with de novo SVs.


2019 ◽  
Author(s):  
Sjors Middelkamp ◽  
Judith M. Vlaar ◽  
Jacques Giltay ◽  
Jerome Korzelius ◽  
Nicolle Besselink ◽  
...  

Abstract Background Genomic structural variants (SVs) can affect many genes and regulatory elements. Therefore, the molecular mechanisms driving the phenotypes of patients with multiple congenital abnormalities and/or intellectual disability carrying de novo SVs are frequently unknown. Results We applied a combination of systematic experimental and bioinformatic methods to improve the molecular diagnosis of 39 patients with de novo SVs and an inconclusive diagnosis after regular genetic testing. In seven of these cases (18%) whole genome sequencing analysis detected disease-relevant complexities of the SVs missed in routine microarray-based analyses. We developed a computational tool to predict effects on genes directly affected by SVs and on genes indirectly affected due to changes in chromatin organization and impact on regulatory mechanisms. By combining these functional predictions with extensive phenotype information, candidate driver genes were identified in 16/39 (41%) patients. In eight cases evidence was found for involvement of multiple candidate drivers contributing to different parts of the phenotypes. Subsequently, we applied this computational method to a collection of 382 patients with previously detected and classified de novo SVs and identified candidate driver genes in 210 cases (54%), including 32 cases whose SVs were previously not classified as pathogenic. Pathogenic positional effects were predicted in 25% of the cases with balanced SVs and in 8% of the cases with copy number variants. Conclusions These results show that driver gene prioritization based on integrative analysis of WGS data with phenotype association and chromatin organization datasets can improve the molecular diagnosis of patients with de novo SVs.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
John T. Lovell ◽  
Nolan B. Bentley ◽  
Gaurab Bhattarai ◽  
Jerry W. Jenkins ◽  
Avinash Sreedasyam ◽  
...  

Abstract Genome-enabled biotechnologies have the potential to accelerate breeding efforts in long-lived perennial crop species. Despite the transformative potential of molecular tools in pecan and other outcrossing tree species, highly heterozygous genomes, significant presence–absence gene content variation, and histories of interspecific hybridization have constrained breeding efforts. To overcome these challenges, here, we present diploid genome assemblies and annotations of four outbred pecan genotypes, including a PacBio HiFi chromosome-scale assembly of both haplotypes of the ‘Pawnee’ cultivar. Comparative analysis and pan-genome integration reveal substantial and likely adaptive interspecific genomic introgressions, including an over-retained haplotype introgressed from bitternut hickory into pecan breeding pedigrees. Further, by leveraging our pan-genome presence–absence and functional annotation database among genomes and within the two outbred haplotypes of the ‘Lakota’ genome, we identify candidate genes for pest and pathogen resistance. Combined, these analyses and resources highlight significant progress towards functional and quantitative genomics in highly diverse and outbred crops.

