The beginning of the end: a chromosomal assembly of the New World malaria mosquito ends with a novel telomere

Author(s):  
Austin Compton ◽  
Jiangtao Liang ◽  
Chujia Chen ◽  
Varvara Lukyanchikova ◽  
Yumin Qi ◽  
...  

ABSTRACT Chromosome level assemblies are accumulating in various taxonomic groups including mosquitoes. However, even in the few reference-quality mosquito assemblies, a significant portion of the heterochromatic regions including telomeres remains unresolved. Here we produce a de novo assembly of the New World malaria mosquito, Anopheles albimanus, by integrating Oxford Nanopore sequencing, Illumina, Hi-C and optical mapping. This 172.6 Mbp female assembly, which we call AalbS3, is obtained by scaffolding polished large contigs (contig N50 = 13.7 Mbp) into three chromosomes. All chromosome arms end with telomeric repeats, which is a first for mosquito assemblies and represents a significant step towards the completion of a genome assembly. These telomeres consist of tandem repeats of a novel 30-32 bp telomeric repeat unit (TRU) and are confirmed by analysing the termini of long reads and through both chromosomal in situ hybridization and a Bal31 sensitivity assay. The AalbS3 assembly included previously uncharacterized centromeric and rDNA clusters and more than doubled the content of transposable elements and other repetitive sequences. This telomere-to-telomere assembly, although still containing gaps, represents a significant step towards resolving biologically important but previously hidden genomic components. The comparison of different scaffolding methods will also inform future efforts to obtain reference-quality genomes for other mosquito species.

100-word Article Summary We report AalbS3, a telomere-to-telomere assembly of the Anopheles albimanus genome produced by integrating advancing technologies including Oxford Nanopore and Bionano optical mapping. AalbS3 features much of the difficult-to-assemble genomic ‘dark matter’ including previously missed transposons, centromeres and rDNA clusters. We describe novel telomeric repeats that are confirmed by analysis of long reads and by telomere hybridization assays. This reference-quality assembly represents a significant step towards completing the genomic puzzle pieces and informs efforts to improve the assembly of other mosquito species. Future research into the relationship between telomere and mosquito life span may have significant implications for disease control.

2020 ◽  
Vol 10 (10) ◽  
pp. 3811-3819 ◽  
Author(s):  
Austin Compton ◽  
Jiangtao Liang ◽  
Chujia Chen ◽  
Varvara Lukyanchikova ◽  
Yumin Qi ◽  
...  

Chromosome level assemblies are accumulating in various taxonomic groups including mosquitoes. However, even in the few reference-quality mosquito assemblies, a significant portion of the heterochromatic regions including telomeres remains unresolved. Here we produce a de novo assembly of the New World malaria mosquito, Anopheles albimanus, by integrating Oxford Nanopore sequencing, Illumina, Hi-C and optical mapping. This 172.6 Mbp female assembly, which we call AalbS3, is obtained by scaffolding polished large contigs (contig N50 = 13.7 Mbp) into three chromosomes. All chromosome arms end with telomeric repeats, which is a first for mosquito assemblies and represents a significant step toward the completion of a genome assembly. These telomeres consist of tandem repeats of a novel 30-32 bp Telomeric Repeat Unit (TRU) and are confirmed by analyzing the termini of long reads and through both chromosomal in situ hybridization and a Bal31 sensitivity assay. The AalbS3 assembly included previously uncharacterized centromeric and rDNA clusters and more than doubled the content of transposable elements and other repetitive sequences. This telomere-to-telomere assembly, although still containing gaps, represents a significant step toward resolving biologically important but previously hidden genomic components. The comparison of different scaffolding methods will also inform future efforts to obtain reference-quality genomes for other mosquito species.
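
The abstract above mentions confirming telomeres by analysing the termini of long reads. As a rough illustration only (not the authors' pipeline), the sketch below guesses the period of a tandem repeat at the end of a sequence by checking how often a base matches the base `p` positions further along. The function name, the window size, and the 7 bp toy unit are all assumptions made for this example; the actual TRU is 30-32 bp and its sequence is not given here.

```python
def terminal_repeat_period(seq, window=300, min_period=2, max_period=60):
    """Guess the tandem-repeat period in the last `window` bases of `seq`.

    For each candidate period p, the score is the fraction of positions i
    in the terminal window where tail[i] == tail[i + p]. The smallest
    period with the highest score is returned as (period, score).
    """
    tail = seq[-window:]
    best_period, best_score = None, 0.0
    for p in range(min_period, min(max_period, len(tail) - 1) + 1):
        compared = len(tail) - p
        matches = sum(1 for i in range(compared) if tail[i] == tail[i + p])
        score = matches / compared
        if score > best_score:  # strict '>' keeps the smallest period on ties
            best_period, best_score = p, score
    return best_period, best_score


# Toy read: 100 bp of non-telomeric sequence followed by 50 copies of a
# hypothetical 7 bp unit (purely illustrative, not the real TRU).
toy_unit = "ACGTTGA"
toy_read = "GATTACAGGCTTAACCGGTT" * 5 + toy_unit * 50

period, score = terminal_repeat_period(toy_read)
print(period, score)
```

Because a real repeat unit that varies between 30 and 32 bp would not match perfectly at any single period, scores on real reads would be expected to fall below 1.0, and an alignment-based approach would likely be more robust than this exact-match heuristic.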


2021 ◽  
Author(s):  
Jean-Marc Aury ◽  
Stefan Engelen ◽  
Benjamin Istace ◽  
Cécile Monat ◽  
Pauline Lasserre-Zuber ◽  
...  

Abstract The sequencing of the wheat (Triticum aestivum) genome has been a methodological challenge for many years due to its large size (15.5 Gb), repeat content, and hexaploidy. Many initiatives aiming at obtaining a reference genome of cultivar Chinese Spring have been launched in past years, and this goal was achieved in 2018 as the result of a huge effort to combine short-read whole genome sequencing with many other resources. Reference-quality genome assemblies were then produced for other accessions, but the rapid evolution of sequencing technologies offers opportunities to reach high-quality standards at lower cost. Here, we report on an optimized procedure based on long reads produced on the ONT (Oxford Nanopore Technology) PromethION device to assemble the genome of the French bread wheat cultivar Renan. We provide the most contiguous and complete chromosome-scale assembly of a bread wheat genome to date, a resource that will be valuable for the crop community and will facilitate the rapid selection of agronomically important traits. We also provide the methodological standards to generate high-quality assemblies of complex genomes.


2018 ◽  
Author(s):  
Stephen M. J. Pollo ◽  
Sarah J. Reiling ◽  
Janneke Wit ◽  
Matthew L. Workentine ◽  
Rebecca A. Guy ◽  
...  

Abstract Background Genomes of the parasite Giardia duodenalis are relatively small for eukaryotic genomes, yet there are only six publicly available. Difficulties in assembling the tetraploid G. duodenalis genome from short read sequencing data likely contribute to this lack of genomic information. We sequenced three isolates of G. duodenalis (AWB, BGS, and beaver) on the Oxford Nanopore Technologies MinION, whose long reads have the potential to address genomic areas that are problematic for short reads. Results Using a hybrid approach that combines MinION long reads and Illumina short reads to take advantage of the continuity of the long reads and the accuracy of the short reads, we generated reference quality genomes for each isolate. The genomes for two of the isolates were evaluated against the available reference genomes for comparison. The third genome, for which there is no previous data, was then assembled. The long reads were used to find structural variants in each isolate to examine heterozygosity. Consistent with previous findings based on SNPs, Giardia BGS was found to be considerably more heterozygous than the other isolates, which are from Assemblage A. We also find an enrichment of variant-specific surface proteins in some of the structural variant regions. Conclusions Our results show that the MinION can be used to generate reference quality genomes in Giardia and further be used to identify structural variant regions that are an important source of genetic variation not previously examined in these parasites.


2021 ◽  
Vol 3 (2) ◽  
Author(s):  
Jean-Marc Aury ◽  
Benjamin Istace

Abstract Single-molecule sequencing technologies have recently been commercialized by Pacific Biosciences and Oxford Nanopore with the promise of sequencing long DNA fragments (on the order of kilobases to megabases) and then, using efficient algorithms, providing high-quality assemblies in terms of contiguity and completeness of repetitive regions. However, the error rate of long-read technologies is higher than that of short-read technologies. This has a direct consequence on the base quality of genome assemblies, particularly in coding regions where sequencing errors can disrupt the coding frame of genes. In the case of diploid genomes, the consensus of a given gene can be a mixture of the two haplotypes and can lead to premature stop codons. Several methods have been developed to polish genome assemblies using short reads; generally, they inspect nucleotides one by one and provide a correction for each nucleotide of the input assembly. As a result, these algorithms are not able to properly process diploid genomes and they typically switch from one haplotype to another. Herein we propose Hapo-G (Haplotype-Aware Polishing Of Genomes), a new algorithm capable of incorporating phasing information from high-quality reads (short or long reads) to polish genome assemblies, in particular assemblies of diploid and heterozygous genomes.


2021 ◽  
Vol 10 (41) ◽  
Author(s):  
Mariem Ben Khedher ◽  
Fredrik Nindo ◽  
Alicia Chevalier ◽  
Stéphane Bonacorsi ◽  
Gregory Dubourg ◽  
...  

We report here the complete genome sequences of three Bacillus cereus group strains isolated from blood cultures from premature and immunocompromised infants hospitalized in intensive care units in three French hospitals. These complete genome sequences were obtained from a combination of Illumina HiSeq X Ten short reads and Oxford Nanopore MinION long reads.


2021 ◽  
Author(s):  
Chi Yang ◽  
Lu Ma ◽  
Donglai Xiao ◽  
Xiaoyu Liu ◽  
Xiaoling Jiang ◽  
...  

Sparassis latifolia is a valuable edible mushroom cultivated in China. In 2018, our research group reported an incomplete, low-quality genome of S. latifolia obtained by Illumina HiSeq 2500 sequencing. These limitations in the available genome have constrained genetic and genomic studies in this mushroom resource. Herein, an updated draft genome sequence of S. latifolia was generated by Oxford Nanopore sequencing and the Hi-C technique. A total of 8.24 Gb of Oxford Nanopore long reads representing ~198.08X coverage of the S. latifolia genome were generated. Subsequently, a high-quality genome of 41.41 Mb, with scaffold and contig N50 sizes of 3.31 Mb and 1.51 Mb, respectively, was assembled. Hi-C scaffolding of the genome resulted in 12 pseudochromosomes containing 93.56% of the bases in the assembled genome. Genome annotation further revealed that 17.47% of the genome was composed of repetitive sequences. In addition, 13,103 protein-coding genes were predicted, among which 98.72% were functionally annotated. BUSCO assay results further revealed that there were 92.07% complete BUSCOs. The improved chromosome-scale assembly and genome features described here will aid further molecular elucidation of various traits, breeding of S. latifolia, and evolutionary studies with related taxa.
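
Several abstracts on this page report contig or scaffold N50 values. For readers unfamiliar with the statistic, here is a minimal sketch of the usual definition: the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the assembly size. The function name and the toy lengths are assumptions for illustration and are not taken from any of the listed papers.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches at least half
    of the summed length of all sequences.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy contig lengths (arbitrary units); total is 300, half is 150.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 80 + 70 = 150 reaches half the total, so N50 is 70
```

A higher N50 indicates that more of the assembly sits in long sequences, which is why it is commonly quoted as a contiguity measure alongside completeness metrics such as BUSCO.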


2021 ◽  
Author(s):  
Yuanying Peng ◽  
Honghai Yan ◽  
Laichun Guo ◽  
Cao Deng ◽  
Lipeng Kang ◽  
...  

Abstract Common oat (Avena sativa) is one of the most important cereal crops serving as a valuable source of forage and human food. While reference genomes of many important crops have been generated, such work in oat has lagged behind, primarily owing to its large, repeat-rich, polyploid genome. By using Oxford Nanopore ultralong sequencing and Hi-C technologies, we have generated the first reference-quality genome assembly of hulless common oat with a contig N50 of 93 Mb. We also assembled the genomes of diploid and tetraploid Avena ancestors, which enabled us to identify oat subgenomes, large-scale structural rearrangements, and preferential gene loss in the C subgenome after hexaploidization. Phylogenomic analyses of cereal crops indicated that the oat lineage descended before wheat, offering oat as a unique window into the early evolution of polyploid plants. The origin and evolution of hexaploid oat are deduced from whole-genome sequencing, plastid genome and transcriptome assemblies of numerous Avena species. The high-quality reference genomes of Avena species with different ploidies and the studies of their polyploidization history will facilitate the full use of crop gene resources and provide a reference for the molecular mechanisms underlying the polyploidization of higher plants, helping us to overcome food security challenges.


Toxins ◽  
2020 ◽  
Vol 12 (9) ◽  
pp. 556
Author(s):  
Catherine Mwangi ◽  
Stephen Njoroge ◽  
Evariste Tshibangu-Kabamba ◽  
Zahir Moloo ◽  
Allan Rajula ◽  
...  

Helicobacter pylori (H. pylori) infection is etiologically associated with severe diseases including gastric cancer, but its pathogenicity is deeply shaped by the exceptional genomic diversification and geographic variation of the species. The clinical relevance of strains colonizing Africa is still debated. This study aimed to explore genomic features and virulence potentials of H. pylori KE21, a typical African strain isolated from a native Kenyan patient diagnosed with gastric cancer. A high-quality circular genome assembly of 1,648,327 bp (1590 genes), obtained as a hybrid of Illumina MiSeq short reads and Oxford Nanopore MinION long reads, clustered within the hpAfrica1 population. This genome revealed a virulome and a mobilome encoding more than a hundred features potentiating successful colonization, persistent infection, and enhanced disease pathogenesis. Furthermore, through an experimental infection of gastric epithelial cell lines, strain KE21 showed the ability to promote interleukin-8 production and to induce cellular alterations resulting from the injection of a functional CagA oncogene protein into the cells. This study shows that strain KE21 is potentially virulent and can trigger oncogenic pathways in gastric epithelial cells. Expanded genomic and clinical explorations are required to evaluate the epidemiological importance of H. pylori infection and its putative complications in the study population.


2020 ◽  
Vol 9 (37) ◽  
Author(s):  
Samuel O’Donnell ◽  
Frederic Chaux ◽  
Gilles Fischer

ABSTRACT The current Chlamydomonas reinhardtii reference genome remains fragmented due to gaps stemming from large repetitive regions. To overcome the vast majority of these gaps, publicly available Oxford Nanopore Technology data were used to create a new reference-quality de novo genome assembly containing only 21 contigs, 30/34 telomeric ends, and a genome size of 111 Mb.


GigaScience ◽  
2020 ◽  
Vol 9 (7) ◽  
Author(s):  
Sina Majidian ◽  
Fritz J Sedlazeck

Abstract Background The detection of which mutations are occurring on the same DNA molecule is essential to predict their consequences. This can be achieved by phasing the genomic variations. Nevertheless, state-of-the-art haplotype phasing is currently a black box in which the accuracy and quality of the reconstructed haplotypes are hard to assess. Findings Here we present PhaseME, a versatile method to provide insights into and improvement of sample phasing results based on linkage data. We showcase the performance and the importance of PhaseME by comparing phasing information obtained from Pacific Biosciences including both continuous long reads and high-quality consensus reads, Oxford Nanopore Technologies, 10x Genomics, and Illumina sequencing technologies. We found that 10x Genomics and Oxford Nanopore phasing can be significantly improved while retaining a high N50 and completeness of phase blocks. PhaseME generates reports and summary plots to provide insights into phasing performance and correctness. We observed unique phasing issues for each of the sequencing technologies, highlighting the necessity of quality assessments. PhaseME is able to decrease the Hamming error rate significantly by 22.4% on average across all 5 technologies. Additionally, a significant improvement is obtained in the reduction of long switch errors. Especially for high-quality consensus reads, the improvement is 54.6% in return for only a 5% decrease in phase block N50 length. Conclusions PhaseME is a universal method to assess the phasing quality and accuracy and improves the quality of phasing using linkage information. The package is freely available at https://github.com/smajidian/phaseme.
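
The PhaseME abstract above reports improvements in Hamming error rate and in switch errors. The sketch below illustrates commonly used definitions of these two metrics for a single phase block, assuming a known truth haplotype; it is not PhaseME's implementation, and the function names and the encoding (a string of `0`/`1` allele labels for one haplotype at each heterozygous site) are assumptions for this example.

```python
def hamming_error_rate(truth, predicted):
    """Fraction of heterozygous sites assigned to the wrong haplotype.

    Haplotype labels are arbitrary, so the predicted block is compared
    against both the truth and its complement and the smaller count is used.
    """
    if len(truth) != len(predicted) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    mismatches = sum(t != p for t, p in zip(truth, predicted))
    return min(mismatches, len(truth) - mismatches) / len(truth)


def switch_errors(truth, predicted):
    """Count adjacent site pairs whose relative phase disagrees with truth."""
    if len(truth) != len(predicted):
        raise ValueError("inputs must be of equal length")
    agree = [t == p for t, p in zip(truth, predicted)]
    return sum(a != b for a, b in zip(agree, agree[1:]))


truth = "0000000000"
long_switch = "0000011111"   # phase flips once and stays flipped
single_flip = "0001000000"   # one isolated site is wrong

print(hamming_error_rate(truth, long_switch), switch_errors(truth, long_switch))
print(hamming_error_rate(truth, single_flip), switch_errors(truth, single_flip))
```

The toy cases show why both metrics are usually reported together: a single long switch counts as one switch error but puts half of the block on the wrong haplotype, whereas an isolated flipped site counts as two switch errors with only a small Hamming error.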

