Genome assembly and annotation of Macadamia tetraphylla

Author(s):  
Ying-Feng Niu ◽  
Guo-Hua Li ◽  
Shu-Bang Ni ◽  
Xi-Yong He ◽  
Cheng Zheng ◽  
...  

Abstract Macadamia is a genus of evergreen nut trees in the Proteaceae family. The two commercial macadamia species, Macadamia integrifolia and M. tetraphylla, are highly prized for their edible kernels. Catherine et al. reported the M. integrifolia genome using next-generation sequencing (NGS) technology. However, the lack of a high-quality assembly for M. tetraphylla hinders progress in biological research and breeding programs. In this study, we report a high-quality genome sequence of M. tetraphylla generated with Oxford Nanopore Technologies (ONT) sequencing. We generated an assembly of 750.54 Mb with a contig N50 length of 1.18 Mb, which is close to the size estimated by flow cytometry and k-mer analysis. Repetitive sequences represent 58.57% of the genome sequence, which is strikingly higher than in M. integrifolia. A total of 31,571 protein-coding genes were annotated with an average length of 6,055 bp, of which 92.59% were functionally annotated. The genome sequence of M. tetraphylla will provide new insights for the breeding of novel strains and the genetic improvement of agronomic traits.
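The contig N50 quoted in the abstract above is, by the usual definition, the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The following is a minimal sketch of that calculation; the function name `n50` and the contig lengths are illustrative assumptions, not data or code from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (hypothetical helper)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first contig that brings coverage to >= 50% of the total.
        if 2 * running >= total:
            return length


# Made-up contig lengths for illustration only (total = 300).
example_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(example_lengths))
```

With these made-up lengths, the two longest contigs (80 + 70) reach half of the 300-unit total, so the reported N50 is the length of the second contig.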

GigaScience ◽  
2020 ◽  
Vol 9 (9) ◽  
Author(s):  
Gina M Pham ◽  
John P Hamilton ◽  
Joshua C Wood ◽  
Joseph T Burke ◽  
Hainan Zhao ◽  
...  

Abstract Background: Worldwide, the cultivated potato, Solanum tuberosum L., is the No. 1 vegetable crop and a critical food security crop. The genome sequence of DM1–3 516 R44, a doubled monoploid clone of S. tuberosum Group Phureja, was published in 2011 using a whole-genome shotgun sequencing approach with short-read sequence data. Current advanced sequencing technologies now permit generation of near-complete, high-quality chromosome-scale genome assemblies at minimal cost. Findings: Here, we present an updated version of the DM1–3 516 R44 genome sequence (v6.1) using Oxford Nanopore Technologies long reads coupled with proximity-by-ligation scaffolding (Hi-C), yielding a chromosome-scale assembly. The new (v6.1) assembly represents 741.6 Mb of sequence (87.8% of the estimated 844 Mb genome), of which 741.5 Mb is non-gapped with 731.2 Mb anchored to the 12 chromosomes. Use of Oxford Nanopore Technologies full-length complementary DNA sequencing enabled annotation of 32,917 high-confidence protein-coding genes encoding 44,851 gene models that had a significantly improved representation of conserved orthologs compared with the previous annotation. The new assembly has improved contiguity with a 595-fold increase in N50 contig size, 99% reduction in the number of contigs, a 44-fold increase in N50 scaffold size, and an LTR Assembly Index score of 13.56, placing it in the category of reference genome quality. The improved assembly also permitted annotation of the centromeres via alignment to sequencing reads derived from CENH3 nucleosomes. Conclusions: Access to advanced sequencing technologies and improved software permitted generation of a high-quality, long-read, chromosome-scale assembly and improved annotation dataset for the reference genotype of potato that will facilitate research aimed at improving agronomic traits and understanding genome evolution.


2020 ◽  
pp. MPMI-08-20-0245
Author(s):  
Fangwei Yu ◽  
Wei Zhang ◽  
Shenyun Wang ◽  
Hong Wang ◽  
Li Yu ◽  
...  

Fusarium oxysporum f. sp. conglutinans is the causal agent of Fusarium wilt of cabbage (Brassica oleracea var. capitata L.), which results in severe yield loss. Here, we report a high-quality genome sequence of a race 1 strain (IVC-1) of F. oxysporum f. sp. conglutinans, which was assembled using a combination of PacBio long-read and Illumina short-read sequences. The assembled IVC-1 genome has a total size of 71.18 Mb, with a contig N50 length of 4.59 Mb, and encodes 23,374 predicted protein-coding genes. The high-quality genome of IVC-1 provides a valuable resource for facilitating our understanding of F. oxysporum f. sp. conglutinans–cabbage interaction. Copyright © 2020 The Author(s). This is an open access article distributed under the CC BY-NC-ND 4.0 International license.


2019 ◽  
Author(s):  
Qiuju Xia ◽  
Ru Zhang ◽  
Xuemei Ni ◽  
Lei Pan ◽  
Yangzi Wang ◽  
...  

Abstract Asparagus bean (Vigna unguiculata ssp. sesquipedalis), known for its very long and tender green pods, is an important vegetable crop widely grown in developing countries. Despite its agricultural and economic value, asparagus bean does not have a high-quality genome assembly for breeding novel agronomic traits. In this study, we report a high-quality 632.8 Mb assembly of asparagus bean based on the whole-genome shotgun sequencing strategy. We also generated a high-density linkage map for asparagus bean, which helped anchor 94.42% of the scaffolds into 11 pseudo-chromosomes. A total of 42,609 protein-coding genes and 3,579 non-protein-coding genes were predicted from the assembly. Taken together, these genomic resources of asparagus bean will facilitate the investigation of economically valuable traits in a variety of legume species, so that the cultivation of these plants can help combat protein and energy malnutrition in the developing world.


Author(s):  
Hengyuan Guo ◽  
Jiandong Bao ◽  
Lianyu Lin ◽  
Zhixin Wang ◽  
Mingyue Shi ◽  
...  

Peronophythora litchii is an oomycete pathogen that exclusively infects litchi, with infection stages affecting a broad range of tissues. In this study, we obtained a near chromosome-level genome assembly of P. litchii strain ZL2018 from China using Oxford Nanopore Technologies (ONT) long-read sequencing and Illumina short-read sequencing. The genome assembly was 64.15 Mb in size and consisted of 81 contigs with an N50 of 1.43 Mb and a maximum length of 4.74 Mb. Excluding 34.67% of repeat sequences, a total of 14,857 protein-coding genes were identified, among which 14,447 genes were annotated. We also predicted 306 candidate RXLR effectors in the assembly. The high-quality genome assembly and annotation resources reported in this study will provide new insight into the infection mechanisms of P. litchii.


2019 ◽  
Vol 8 (34) ◽  
Author(s):  
Anthony Wong ◽  
Ana Carolina M. Junqueira ◽  
Ankur Chaturvedi ◽  
Akira Uchida ◽  
Rikky W. Purbojati ◽  
...  

Pseudomonas sp. strain SGAir0191 was isolated from an air sample collected in Singapore, and its genome was sequenced using a combination of long and short reads to generate a high-quality genome assembly. The complete genome is approximately 5.07 Mb with 4,370 protein-coding genes, 19 rRNAs, and 73 tRNAs.


2021 ◽  
Vol 10 (43) ◽  
Author(s):  
Callum J. Bell ◽  
Johnny A. Sena ◽  
Isaac S. Gifford ◽  
Alison M. Berry

We report the genome sequence of Frankia sp. strain ArI3, recovered as a single contig from one run of the Oxford Nanopore Technologies (ONT) MinION instrument. The genome has a G+C content of 72%, is 7,541,222 bp long, and contains 5,427 predicted protein-coding genes.


2021 ◽  
Author(s):  
Gábor Torma ◽  
Dóra Tombácz ◽  
Norbert Moldován ◽  
Ádám Fülöp ◽  
István Prazsák ◽  
...  

Abstract In this study, we used two long-read sequencing (LRS) techniques, Sequel from Pacific Biosciences and MinION from Oxford Nanopore Technologies, for the transcriptional characterization of a prototype baculovirus, Autographa californica multiple nucleopolyhedrovirus. LRS is able to read full-length RNA molecules, and thereby to distinguish between transcript isoforms, mono- and polycistronic RNAs, and overlapping transcripts. Altogether, we detected 875 transcripts, of which 759 are novel and 116 have been annotated previously. These RNA molecules include 41 novel putative protein-coding transcripts (each containing 5’-truncated in-frame ORFs), 14 monocistronic transcripts, 99 multicistronic RNAs, 101 non-coding RNAs, and 504 length isoforms. We also detected RNA methylation in 12 viral genes and RNA hyper-editing in the longer 5’-UTR transcript isoform of the ORF 19 gene.


Toxins ◽  
2018 ◽  
Vol 10 (12) ◽  
pp. 488 ◽  
Author(s):  
Shiyong Zhang ◽  
Jia Li ◽  
Qin Qin ◽  
Wei Liu ◽  
Chao Bian ◽  
...  

Naturally derived toxins from animals are good raw materials for drug development. As a representative venomous teleost, Chinese yellow catfish (Pelteobagrus fulvidraco) can provide valuable resources for studies on toxin genes. Its venom glands are located in the pectoral and dorsal fins. Despite these interesting biological traits and its great economic value, Chinese yellow catfish still lacks a sequenced genome. Here, we report a high-quality genome assembly of Chinese yellow catfish using a combination of next-generation Illumina and third-generation PacBio sequencing platforms. The final assembly reached 714 Mb, with a contig N50 of 970 kb and a scaffold N50 of 3.65 Mb. We also annotated 21,562 protein-coding genes, of which 97.59% were assigned at least one functional annotation. Based on the genome sequence, we analyzed toxin genes in Chinese yellow catfish. Finally, we identified 207 toxin genes and classified them into three major groups. Interestingly, we also expanded a previously reported sex-related region (to ≈6 Mb) in the achieved genome assembly, and localized two important toxin genes within this region. In summary, we assembled a high-quality genome of Chinese yellow catfish and performed high-throughput identification of toxin genes from a genomic view. Therefore, the limited number of toxin sequences in public databases will be remarkably improved once we integrate multi-omics data from more and more sequenced species.


2021 ◽  
Author(s):  
Masa-aki Yoshida ◽  
Kazuki Hirota ◽  
Junichi Imoto ◽  
Miki Okuno ◽  
Hiroyuki Tanaka ◽  
...  

The paper nautilus, Argonauta argo, also known as the greater argonaut, is a species of octopod distinctly characterized by its pelagic lifestyle and by the presence of a spiral-shaped shell-like eggcase in females. The eggcase functions by protecting the eggs laid inside it and by building and keeping air intakes for buoyancy. To reveal the genomic background of the species' adaptation to a pelagic lifestyle and the acquisition of its shell-like eggcase, we sequenced a draft genome of the species. The genome size was 1.1 Gb, which is the smallest among the cephalopods known to date, with the top 215 scaffolds (average length 5,064,479 bp) covering 81% (1.09 Gb) of the total assembly. A total of 26,433 protein-coding genes were predicted from 16,802 assembled scaffolds. From these, we identified nearly intact HOX, ParaHox, and Wnt clusters and some gene clusters probably related to the pelagic lifestyle, such as reflectin, tyrosinase, and opsin. For example, opsin might have undergone an extensive duplication in order to adapt to the pelagic lifestyle, as opposed to other octopuses, which are mostly benthic. Our gene models also revealed several genes homologous to those related to calcified shell formation in Conchiferan Mollusks, such as Pif-like, SOD, and TRX. Interestingly, comparative genomics analysis revealed that homologs of these genes were also found in the genome of the octopus, which does not have a shell, as well as in the basal cephalopod Nautilus. Therefore, the draft genome sequence of A. argo presented here has not only helped us gain further insights into the genetic background of the dynamic recruitment and dismissal of genes for the formation of an important, converging extended phenotypic structure such as the shell and the shell-like eggcase, but also into the evolution of lifestyles in Cephalopods and the octopods, from benthic to pelagic.


2020 ◽  
Vol 9 (2) ◽  
Author(s):  
Prasad Thomas ◽  
Mostafa Y. Abdel-Glil ◽  
Anne Busch ◽  
Lothar H. Wieler ◽  
Inga Eichhorn ◽  
...  

Clostridium limosum can be found in soil and the intestinal tract of animals. In 2014, C. limosum was isolated from a suspected blackleg outbreak in cattle in Schleswig-Holstein, Germany. We present a complete genome sequence of a C. limosum strain represented by a circular chromosome and three plasmids.

