Genome Sequence Data of Peronophythora litchii, an Oomycete Pathogen Causing Litchi Downy Blight

Author(s):  
Hengyuan Guo ◽  
Jiandong Bao ◽  
Lianyu Lin ◽  
Zhixin Wang ◽  
Mingyue Shi ◽  
...  

Peronophythora litchii is an oomycete pathogen that exclusively infects litchi, with infection stages affecting a broad range of tissues. In this study, we obtained a near chromosome-level genome assembly of P. litchii strain ZL2018 from China using Oxford Nanopore Technologies (ONT) long-read sequencing and Illumina short-read sequencing. The genome assembly was 64.15 Mb in size and consisted of 81 contigs with an N50 of 1.43 Mb and a maximum length of 4.74 Mb. Excluding 34.67% of repeat sequences, a total of 14,857 protein-coding genes were identified, among which 14,447 genes were annotated. We also predicted 306 candidate RXLR effectors in the assembly. The high-quality genome assembly and annotation resources reported in this study will provide new insight into the infection mechanisms of P. litchii.
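The contig N50 quoted in this abstract is a standard contiguity statistic: the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the total assembly size. A minimal Python sketch of that calculation follows; the function name and the toy contig lengths are illustrative assumptions and are not taken from the ZL2018 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to at least
        # half of the assembly size defines the N50.
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


# Toy example (lengths in Mb, hypothetical): total 15, half 7.5.
toy_contigs = [5, 4, 3, 2, 1]
print(n50(toy_contigs))
```

With the toy lengths, the running sum reaches 9 at the second contig, so the N50 is 4.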

Plant Disease ◽  
2021 ◽  
Author(s):  
Zhixin Wang ◽  
Jiandong Bao ◽  
Lin Lv ◽  
Lianyu Lin ◽  
Zhiting Li ◽  
...  

Phytophthora colocasiae is a destructive oomycete pathogen of taro (Colocasia esculenta), which causes taro leaf blight. To date, only one highly fragmented Illumina short-read-based genome assembly is available for this species. To address this problem, we sequenced strain Lyd2019 from China using Oxford Nanopore Technologies (ONT) long-read sequencing and Illumina short-read sequencing. We generated a 92.51-Mb genome assembly consisting of 105 contigs with an N50 of 1.70 Mb and a maximum length of 4.17 Mb. In the genome assembly, we identified 52.78% repeats and 18,322 protein-coding genes, of which 12,782 genes were annotated. We also identified 191 candidate RXLR effectors and 1 candidate CRN effector. The updated near-chromosome-level genome assembly and annotation resources will provide a better understanding of the infection mechanisms of P. colocasiae.


Plant Disease ◽  
2021 ◽  
Author(s):  
Jiandong Bao ◽  
Q. Q. Wu ◽  
Jianqin Huang ◽  
Chuan-Qing Zhang

Botryosphaeria dothidea is a latent pathogen of global importance to woody plant health that causes serious trunk cankers on Chinese hickory. To date, only one Illumina short-read-based genome assembly, of strain CK16, is available for the host Chinese hickory. To address this problem, we report a near telomere-to-telomere genome assembly of strain BDLA16-7 (46.05 Mb, N50 3.87 Mb) generated using Oxford Nanopore sequencing technology. The genome assembly consists of 15 contigs, of which 3 were assembled to chromosome level, and the maximum contig length is 6.19 Mb. The assembly contains 7.96% repeats and 12,815 protein-coding genes (10,274 genes were functionally annotated). We also identified 3,642 pathogen-host interaction (PHI) genes, 250 carbohydrate-active enzymes (CAZymes), 252 cytochrome P450 enzymes (CYPs), 752 putative secreted proteins, and 63 secondary metabolite biosynthesis gene clusters (SMBGCs). The BUSCO completeness of the genome assembly and of the predicted genes was 99.34% and 97.50%, respectively, at the fungal level (n=758). This almost chromosome-level, well-annotated genome assembly will provide a valuable genetic resource for understanding the infection mechanisms of B. dothidea.


2021 ◽  
Author(s):  
Gábor Torma ◽  
Dóra Tombácz ◽  
Norbert Moldován ◽  
Ádám Fülöp ◽  
István Prazsák ◽  
...  

Abstract In this study, we used two long-read sequencing (LRS) techniques, Sequel from Pacific Biosciences and MinION from Oxford Nanopore Technologies, for the transcriptional characterization of a prototype baculovirus, Autographa californica multiple nucleopolyhedrovirus. LRS is able to read full-length RNA molecules, and thereby to distinguish between transcript isoforms, mono- and polycistronic RNAs, and overlapping transcripts. Altogether, we detected 875 transcripts, of which 759 are novel and 116 have been annotated previously. These RNA molecules include 41 novel putative protein-coding transcripts (each containing 5’-truncated in-frame ORFs), 14 monocistronic transcripts, 99 multicistronic RNAs, 101 non-coding RNAs, and 504 length isoforms. We also detected RNA methylation in 12 viral genes and RNA hyper-editing in the longer 5’-UTR transcript isoform of the ORF 19 gene.


2021 ◽  
Author(s):  
Arang Rhie ◽  
Ann Mc Cartney ◽  
Kishwar Shafin ◽  
Michael Alonge ◽  
Andrey Bzikadze ◽  
...  

Abstract Advances in long-read sequencing technologies and genome assembly methods have enabled the recent completion of the first Telomere-to-Telomere (T2T) human genome assembly, which resolves complex segmental duplications and large tandem repeats, including centromeric satellite arrays in a complete hydatidiform mole (CHM13). Though derived from highly accurate sequencing, evaluation revealed that the initial T2T draft assembly had evidence of small errors and structural misassemblies. To correct these errors, we designed a novel repeat-aware polishing strategy that made accurate assembly corrections in large repeats without overcorrection, ultimately fixing 51% of the existing errors and improving the assembly QV to 73.9. By comparing our results to standard automated polishing tools, we outline common polishing errors and offer practical suggestions for genome projects with limited resources. We also show how sequencing biases in both PacBio HiFi and Oxford Nanopore Technologies reads cause signature assembly errors that can be corrected with a diverse panel of sequencing technologies.
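The assembly QV of 73.9 reported in this abstract can be read as a per-base error rate if one assumes the conventional Phred-scaled definition, QV = -10 · log10(p). The abstract does not state the definition, so that reading is an assumption, and the function names in this short Python sketch are hypothetical.

```python
import math


def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(p):
    """Convert a per-base error probability to a Phred-scaled quality value."""
    return -10 * math.log10(p)


# Assumption: the reported QV follows the Phred convention.
rate = qv_to_error_rate(73.9)
print(f"{rate:.2e} errors per base, about 1 error in {1 / rate:,.0f} bases")
```

Under that assumption, QV 73.9 corresponds to roughly one error per 24.5 million bases.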


2021 ◽  
Author(s):  
R. Alan Harris ◽  
Muthuswamy Raveendran ◽  
Dustin T Lyfoung ◽  
Fritz J Sedlazeck ◽  
Medhat Mahmoud ◽  
...  

Background The Syrian hamster (Mesocricetus auratus) has been suggested as a useful mammalian model for a variety of diseases and infections, including infection with respiratory viruses such as SARS-CoV-2. The MesAur1.0 genome assembly was published in 2013 using whole-genome shotgun sequencing with short-read sequence data. More advanced sequencing technologies and assembly methods now permit the generation of near-complete genome assemblies with higher quality and higher continuity. Findings Here, we report an improved assembly of the M. auratus genome (BCM_Maur_2.0) using Oxford Nanopore Technologies long-read sequencing to produce a chromosome-scale assembly. The total length of the new assembly is 2.46 Gbp, similar to the 2.50 Gbp length of the previous assembly of this genome, MesAur1.0. BCM_Maur_2.0 exhibits significantly improved continuity, with a scaffold N50 that is 6.7 times greater than that of MesAur1.0. Furthermore, 21,616 protein-coding genes and 10,459 noncoding genes were annotated in BCM_Maur_2.0, compared to 20,495 protein-coding genes and 4,168 noncoding genes in MesAur1.0. The new assembly also reduces the unresolved regions as measured by nucleotide ambiguities: approximately 17.11% of bases in MesAur1.0 were unresolved, compared to 3.00% in BCM_Maur_2.0. Conclusions Access to a more complete reference genome with improved accuracy and continuity will facilitate more detailed, comprehensive, and meaningful research results for a wide variety of future studies using Syrian hamsters as models.


2021 ◽  
Vol 10 (22) ◽  
Author(s):  
Chanakya Pachi Pulusu ◽  
Balaram Khamari ◽  
Manmath Lama ◽  
Arun Sai Kumar Peketi ◽  
Prakash Kumar ◽  
...  

The draft genome of pandrug-resistant Pseudomonas aeruginosa strain SPA03, which belongs to global high-risk sequence type 357 (ST357) and was isolated from a patient with benign prostatic hyperplasia, is presented in this report. The genome assembly was generated by combining short-read Illumina HiSeq-X Ten and long-read Oxford Nanopore Technologies MinION sequence data using the Unicycler assembler.


Plant Disease ◽  
2021 ◽  
Author(s):  
Shiqin Zheng ◽  
Ruiqi Chen ◽  
Zhe Wang ◽  
Juan Liu ◽  
Yan Cai ◽  
...  

Tea grey blight, caused by the plant-pathogenic fungus Pseudopestalotiopsis theae, is one of the most serious foliar diseases of tea and can affect tea production and quality worldwide. We generated a highly contiguous, 50.41-Mbp genome assembly (N50 1.30 Mbp) of P. theae strain CYF27 by combining PacBio long-read and Illumina short-read sequencing technologies. We identified a total of 15,626 gene models, of which 1,038 genes encode putative secreted proteins. The high-quality genome assembly and annotation resource reported here will be useful for the study of fungal infection mechanisms and pathogen-host interaction.


Author(s):  
Ying-Feng Niu ◽  
Guo-Hua Li ◽  
Shu-Bang Ni ◽  
Xi-Yong He ◽  
Cheng Zheng ◽  
...  

Abstract Macadamia is a genus of evergreen nut trees belonging to the Proteaceae family. The two commercial macadamia species, Macadamia integrifolia and M. tetraphylla, are highly prized for their edible kernels. Catherine et al. reported the M. integrifolia genome using NGS sequencing technology. However, the lack of a high-quality assembly for M. tetraphylla hinders progress in biological research and breeding programs. In this study, we report a high-quality genome sequence of M. tetraphylla obtained using Oxford Nanopore Technologies (ONT) sequencing. We generated an assembly of 750.54 Mb with a contig N50 length of 1.18 Mb, which is close to the size estimated by flow cytometry and k-mer analysis. Repetitive sequences represent 58.57% of the genome sequence, which is strikingly higher than in M. integrifolia. A total of 31,571 protein-coding genes were annotated with an average length of 6,055 bp, of which 92.59% were functionally annotated. The genome sequence of M. tetraphylla will provide novel insights into the breeding of novel strains and the genetic improvement of agronomic traits.


BMC Genomics ◽  
2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Kumar Paritosh ◽  
Akshay Kumar Pradhan ◽  
Deepak Pental

Abstract Background Brassica nigra (BB), also called black mustard, is grown as a condiment crop in India. B. nigra represents the B genome of U’s triangle and is one of the progenitor species of B. juncea (AABB), an important oilseed crop of the Indian subcontinent. We report the genome assembly of B. nigra variety Sangam. Results The genome assembly was carried out using Oxford Nanopore long-read sequencing and optical mapping. A total of 1549 contigs were assembled, which covered ~ 515.4 Mb of the estimated ~ 522 Mb of the genome. The final assembly consisted of 15 scaffolds that were assigned to eight pseudochromosomes using a high-density genetic map of B. nigra. Around 246 Mb of the genome consisted of the repeat elements; LTR/Gypsy types of retrotransposons being the most predominant. The B genome-specific repeats were identified in the centromeric regions of the B. nigra pseudochromosomes. A total of 57,249 protein-coding genes were identified of which 42,444 genes were found to be expressed in the transcriptome analysis. A comparison of the B genomes of B. nigra and B. juncea revealed high gene colinearity and similar gene block arrangements. A comparison of the structure of the A, B, and C genomes of U’s triangle showed the B genome to be divergent from the A and C genomes for gene block arrangements and centromeric regions. Conclusions A highly contiguous genome assembly of the B. nigra genome reported here is an improvement over the previous short-read assemblies and has allowed a comparative structural analysis of the A, B, and C genomes of the species belonging to the U’s triangle. Based on the comparison, we propose a new nomenclature for B. nigra pseudochromosomes, taking the B. rapa pseudochromosome nomenclature as the reference.


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Junwei Luo ◽  
Mengna Lyu ◽  
Ranran Chen ◽  
Xiaohong Zhang ◽  
Huimin Luo ◽  
...  

Abstract Background Scaffolding is an important step in genome assembly that orders and orients the contigs produced by assemblers. However, repetitive regions in contigs usually prevent scaffolding from producing accurate results. How to solve the problem of repetitive regions has received a great deal of attention. In the past few years, long reads sequenced by third-generation sequencing technologies (Pacific Biosciences and Oxford Nanopore) have been demonstrated to be useful for sequencing repetitive regions in genomes. Although some stand-alone scaffolding algorithms based on long reads have been presented, scaffolding still requires a new strategy to take full advantage of the characteristics of long reads. Results Here, we present a new scaffolding algorithm based on long reads and contig classification (SLR). Through the alignment information of long reads and contigs, SLR classifies the contigs into unique contigs and ambiguous contigs for addressing the problem of repetitive regions. Next, SLR uses only unique contigs to produce draft scaffolds. Then, SLR inserts the ambiguous contigs into the draft scaffolds and produces the final scaffolds. We compare SLR to three popular scaffolding tools by using long-read datasets sequenced with Pacific Biosciences and Oxford Nanopore technologies. The experimental results show that SLR can produce better results in terms of accuracy and completeness. The open-source code of SLR is available at https://github.com/luojunwei/SLR. Conclusion In this paper, we describe SLR, which is designed to scaffold contigs using long reads. We conclude that SLR can improve the completeness of genome assembly.
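The contig classification step described in this abstract can be illustrated with a toy heuristic. The abstract does not give SLR's actual criteria, so the rule below is an assumption made purely for illustration and is not SLR's method: a contig linked by long reads to more than two distinct neighbours is treated as ambiguous, since a contig from a unique region of a linear chromosome has at most one neighbour at each end. The function name and the input format (pairs of contig names joined by at least one long read) are likewise hypothetical.

```python
from collections import defaultdict


def classify_contigs(links):
    """Split contigs into (unique, ambiguous) sets from long-read links.

    `links` is an iterable of (contig_a, contig_b) pairs, each meaning that
    some long read aligns to both contigs. Assumed rule, not SLR's own:
    more than two distinct neighbours marks a contig as ambiguous.
    """
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    unique = {c for c, linked in neighbours.items() if len(linked) <= 2}
    ambiguous = set(neighbours) - unique
    return unique, ambiguous


# Hypothetical example: contig "r" (a repeat) is linked to four other contigs.
toy_links = [("c1", "r"), ("r", "c2"), ("c3", "r"), ("r", "c4")]
unique, ambiguous = classify_contigs(toy_links)
print(sorted(unique), sorted(ambiguous))
```

In a pipeline shaped like the one the abstract outlines, only the contigs in `unique` would be used to build draft scaffolds, with the ambiguous ones placed afterwards.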

