Metagenomic analysis of nepoviruses: diversity, evolution and identification of a genome region in members of subgroup A that appears to be important for host range

Archives of Virology ◽

10.1007/s00705-021-05111-0 ◽

2021 ◽

Author(s):

J. M. Hily ◽

N. Poulicard ◽

J. Kubina ◽

J. S. Reynard ◽

A. S. Spilmont ◽

...

Keyword(s):

Host Range ◽

High Throughput Sequencing ◽

Plant Viruses ◽

Metagenomic Analysis ◽

Phylogeographic Structure ◽

Sequence Information ◽

Arabis Mosaic Virus ◽

Genome Region ◽

Virus Diversity ◽

A Genome

Abstract Data mining and metagenomic analysis of 277 open reading frame sequences of bipartite RNA viruses of the genus Nepovirus, family Secoviridae, were performed, documenting how challenging it can be to unequivocally assign a virus to a particular species, especially those in subgroups A and C, based on some of the currently adopted taxonomic demarcation criteria. This work suggests a possible need for their amendment to accommodate pangenome information. In addition, we revealed a host-dependent structure of arabis mosaic virus (ArMV) populations at a cladistic level and confirmed a phylogeographic structure of grapevine fanleaf virus (GFLV) populations. We also identified new putative recombination events in members of subgroups A, B and C. The evolutionary specificity of some capsid regions of ArMV and GFLV that were described previously and biologically validated as determinants of nematode transmission was circumscribed in silico. Furthermore, a C-terminal segment of the RNA-dependent RNA polymerase of members of subgroup A was predicted to be a putative host range determinant based on statistically supported higher π (substitutions per site) values for GFLV and ArMV isolates infecting Vitis spp. compared with non-Vitis-infecting ArMV isolates. This study illustrates how sequence information obtained via high-throughput sequencing can increase our understanding of mechanisms that modulate virus diversity and evolution and create new opportunities for advancing studies on the biology of economically important plant viruses.
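The π statistic mentioned in this abstract is commonly computed as the average number of pairwise differences per site across a set of aligned sequences. The abstract does not say which software or exact estimator the authors used, so the following is only a minimal sketch of that general definition. The function name `nucleotide_diversity` and the toy sequences are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def nucleotide_diversity(aligned_seqs):
    """Mean pairwise differences per site (pi) for aligned sequences.

    Illustrative helper (hypothetical name). Sites where either sequence
    has a gap or an ambiguous base are skipped for that pair.
    """
    if len(aligned_seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")

    total = 0.0
    n_pairs = 0
    for a, b in combinations(aligned_seqs, 2):
        compared = 0
        diffs = 0
        for x, y in zip(a.upper(), b.upper()):
            if x in "ACGTU" and y in "ACGTU":
                compared += 1
                if x != y:
                    diffs += 1
        if compared:
            total += diffs / compared
            n_pairs += 1
    return total / n_pairs if n_pairs else float("nan")


# Toy alignment (made-up data): two of the three pairs differ at one of four sites.
toy = ["ACGT", "ACGA", "ACGT"]
pi_toy = nucleotide_diversity(toy)
```

Comparing such values between groups of isolates (for example Vitis-infecting versus non-Vitis-infecting) over a sliding window along the alignment is one way a region of elevated diversity could be located. Whether the authors proceeded this way is not stated in the abstract.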


Metagenomic Analyses of Nepoviruses: Diversity, Evolution and Identification of a Hitherto Undescribed Putative Region for Host Range in Subgroup A Species

10.21203/rs.3.rs-171249/v1 ◽

2021 ◽

Author(s):

Jean-Michel Hily ◽

Nils Poulicard ◽

Julie Kubina ◽

Jean-Sebastien Reynard ◽

Anne-Sophie Spilmont ◽

...

Keyword(s):

Host Range ◽

High Throughput Sequencing ◽

Plant Viruses ◽

Terminal Segment ◽

Phylogeographic Structure ◽

Sequence Information ◽

Arabis Mosaic Virus ◽

Reading Frame ◽

Virus Diversity ◽

Demarcation Criteria

Abstract Data mining and metagenomic analyses of 277 open reading frame sequences of bipartite RNA viruses and variants in the genus Nepovirus documented how delicate it can be to unequivocally identify species, in particular subgroup A and C species, based on some of the currently adopted taxonomic demarcation criteria. This suggests a possible need for their amendment to accommodate pangenome information. In addition, we revealed a host-dependent structure of arabis mosaic virus (ArMV) populations at a cladistic level and confirmed a phylogeographic structure of grapevine fanleaf virus (GFLV) populations. We also identified new putative recombination events for species of subgroups A, B and C. The evolutionary specificity of some capsid regions of ArMV and GFLV that were previously described and biologically validated as vector determinants was circumscribed in silico. Furthermore, a C-terminal segment of the RNA-dependent RNA polymerase of subgroup A species was predicted to be a putative host range determinant based on statistically supported higher π values for GFLV and ArMV isolates infecting Vitis spp. compared with non-Vitis-infecting ArMV isolates. This study illustrated how sequence information obtained via high-throughput sequencing can increase our understanding of mechanisms that modulate virus diversity and evolution and create new opportunities for advancing studies on the biology of economically important plant viruses.


Illuminating the Plant Rhabdovirus Landscape through Metatranscriptomics Data

Viruses ◽

10.3390/v13071304 ◽

2021 ◽

Vol 13 (7) ◽

pp. 1304

Author(s):

Nicolás Bejerman ◽

Ralf G. Dietzgen ◽

Humberto Debat

Keyword(s):

Plant Species ◽

High Throughput Sequencing ◽

Plant Viruses ◽

The Novel ◽

Coding Regions ◽

Public Data ◽

Invaluable Tool ◽

Sequencing Platforms ◽

Viral Sequences ◽

Plant Rhabdovirus

Rhabdoviruses infect a large number of plant species and cause significant crop diseases. They have a negative-sense, single-stranded unsegmented or bisegmented RNA genome. The number of plant-associated rhabdovirid sequences has grown in the last few years in concert with the extensive use of high-throughput sequencing platforms. Here, we report the discovery of 27 novel rhabdovirus genomes associated with 25 different host plant species and one insect, which were hidden in public databases. These viral sequences were identified through homology searches in more than 3000 plant and insect transcriptomes from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) using known plant rhabdovirus sequences as the query. The identification, assembly and curation of raw SRA reads resulted in sixteen viral genome sequences with full-length coding regions and ten partial genomes. Highlights of the obtained sequences include viruses with unique and novel genome organizations among known plant rhabdoviruses. Phylogenetic analysis showed that thirteen of the novel viruses were related to cytorhabdoviruses, one to alphanucleorhabdoviruses, five to betanucleorhabdoviruses, one to dichorhaviruses and seven to varicosaviruses. These findings resulted in the most complete phylogeny of plant rhabdoviruses to date and shed new light on the phylogenetic relationships and evolutionary landscape of this group of plant viruses. Furthermore, this study provided additional evidence for the complexity and diversity of plant rhabdovirus genomes and demonstrated that analyzing SRA public data provides an invaluable tool to accelerate virus discovery, gain evolutionary insights and refine virus taxonomy.


Molecular Characterization of Potato Virus Y (PVY) Using High-Throughput Sequencing: Constraints on Full Genome Reconstructions Imposed by Mixed Infection Involving Recombinant PVY Strains

Plants ◽

10.3390/plants10040753 ◽

2021 ◽

Vol 10 (4) ◽

pp. 753

Author(s):

Miroslav Glasa ◽

Richard Hančinský ◽

Katarína Šoltys ◽

Lukáš Predajňa ◽

Jana Tomašechová ◽

...

Keyword(s):

High Throughput ◽

Potato Virus ◽

Mixed Infection ◽

High Throughput Sequencing ◽

Potato Virus Y ◽

De Novo ◽

Sweet Pepper ◽

Complete Sequence ◽

Genome Region ◽

Mapping Parameters

In recent years, high-throughput sequencing (HTS) has brought new possibilities to the study of the diversity and complexity of plant viromes. Mixed infection of a single plant with several viruses is frequently observed in such studies. We analyzed the virome of 10 tomato and sweet pepper samples from Slovakia, all showing the presence of potato virus Y (PVY) infection. Most datasets allowed the determination of the nearly complete sequence of a single-variant PVY genome, belonging to one of the PVY recombinant strains (N-Wi, NTNa, or NTNb). However, in three tomato samples (T1, T40, and T62) the presence of N-type and O-type sequences spanning the same genome region was documented, indicative of mixed infections involving different PVY strain variants and hampering the automated assembly of the PVY genomes present in the sample. The N- and O-type in silico data were further confirmed by specific RT-PCR assays targeting the UTR-P1 and NIa genomic parts. Although full genomes could not be de novo assembled directly in this situation, their deep coverage by relatively long paired reads allowed their manual re-assembly using very stringent mapping parameters. These results highlight the complexity of PVY infection of some host plants and the challenges that can be met when trying to precisely identify the PVY isolates involved in mixed infection.


Deep learning predicts short non-coding RNA functions from only raw sequence data

PLoS Computational Biology ◽

10.1371/journal.pcbi.1008415 ◽

2020 ◽

Vol 16 (11) ◽

pp. e1008415

Author(s):

Teresa Maria Rosaria Noviello ◽

Francesco Ceccarelli ◽

Michele Ceccarelli ◽

Luigi Cerulo

Keyword(s):

Secondary Structure ◽

Sequence Data ◽

Computational Cost ◽

Large Data ◽

Sequence Information ◽

Structure Information ◽

Non Coding Rna ◽

A Genome ◽

Data Volume ◽

Biological Functionality

Small non-coding RNAs (ncRNAs) are short non-coding sequences involved in gene regulation in many biological processes and diseases. The lack of a complete understanding of their biological functionality, especially in a genome-wide scenario, has demanded new computational approaches to annotate their roles. It is widely held that secondary structure is a determinant of RNA function, and machine-learning-based approaches have been shown to predict RNA function successfully from secondary structure information. Here we show that RNA function can be predicted with good accuracy from a lightweight representation of sequence information, without the need to compute secondary structure features, which is computationally expensive. This finding appears to go against the dogma of secondary structure being a key determinant of function in RNA. Compared to recent secondary-structure-based methods, the proposed solution is more robust to sequence boundary noise and drastically reduces the computational cost, allowing for the annotation of large data volumes. Scripts and datasets to reproduce the results of the experiments proposed in this study are available at: https://github.com/bioinformatics-sannio/ncrna-deep.


Sixty Years After the First Description: Genome Sequence and Biological Characterization of European Wheat Striate Mosaic Virus Infecting Cereal Crops

Phytopathology ◽

10.1094/phyto-07-19-0258-fi ◽

2020 ◽

Vol 110 (1) ◽

pp. 68-79 ◽

Cited By ~ 4

Author(s):

Merike Sõmera ◽

Anders Kvarnheden ◽

Cécile Desbiez ◽

Dag-Ragnar Blystad ◽

Pille Sooväli ◽

...

Keyword(s):

Mosaic Virus ◽

High Throughput Sequencing ◽

Poa Pratensis ◽

Plant Viruses ◽

Phleum Pratense ◽

Grass Species ◽

Transmission Factor ◽

Range Data ◽

Cereal Species ◽

Systemic Spread

High-throughput sequencing technologies were used to identify plant viruses in cereal samples surveyed from 2012 to 2017. Fifteen genome sequences of a tenuivirus infecting wheat, oats, and spelt in Estonia, Norway, and Sweden were identified and characterized by their distances to other tenuivirus sequences. Like most tenuiviruses, the genome of this tenuivirus contains four genomic segments. The isolates found from different countries shared at least 92% nucleotide sequence identity at the genome level. The planthopper Javesella pellucida was identified as a vector of the virus. Laboratory transmission tests using this vector indicated that wheat, oats, barley, rye, and triticale, but none of the tested pasture grass species (Alopecurus pratensis, Dactylis glomerata, Festuca rubra, Lolium multiflorum, Phleum pratense, and Poa pratensis), are susceptible. Taking into account the vector and host range data, the tenuivirus we have found most probably represents European wheat striate mosaic virus first identified about 60 years ago. Interestingly, whereas we were not able to infect any of the tested cereal species mechanically, Nicotiana benthamiana was infected via mechanical inoculation in laboratory conditions, displaying symptoms of yellow spots and vein clearing evolving into necrosis, eventually leading to plant death. Surprisingly, one of the virus genome segments (RNA2) encoding both a putative host systemic movement enhancer protein and a putative vector transmission factor was not detected in N. benthamiana after several passages even though systemic infection was observed, raising fundamental questions about the role of this segment in the systemic spread in several hosts.


Host Range Evolution of Potyviruses: A Global Phylogenetic Analysis

Viruses ◽

10.3390/v12010111 ◽

2020 ◽

Vol 12 (1) ◽

pp. 111 ◽

Cited By ~ 2

Author(s):

Benoît Moury ◽

Cécile Desbiez

Keyword(s):

Phylogenetic Analysis ◽

Host Range ◽

Plant Species ◽

Control Strategies ◽

Ecological Factors ◽

Plant Viruses ◽

Range Data ◽

Disease Emergence ◽

Host Range Evolution ◽

Ecological Mechanisms

Virus host range, i.e., the number and diversity of host species of viruses, is an important determinant of disease emergence and of the efficiency of disease control strategies. However, for plant viruses, little is known about the genetic or ecological factors involved in the evolution of host range. Using available genome sequences and host range data, we performed a phylogenetic analysis of host range evolution in the genus Potyvirus, a large group of plant RNA viruses that underwent a radiative evolution circa 7000 years ago, contemporaneously with agricultural intensification in the mid-Holocene. Maximum likelihood inference based on a set of 59 potyviruses and 38 plant species showed frequent host range changes during potyvirus evolution, with 4.6 changes per plant species on average, including 3.1 host gains and 1.5 host losses. These changes were quite recent, 74% of them being inferred on the terminal branches of the potyvirus tree. The most striking result was the high frequency of correlated host gains occurring repeatedly in different branches of the potyvirus tree, which raises the question of the dependence of the molecular and/or ecological mechanisms involved in adaptation to different plant species.


Genome-wide diversity and global migration patterns in dromedaries follow ancient caravan routes

Communications Biology ◽

10.1038/s42003-020-1098-7 ◽

2020 ◽

Vol 3 (1) ◽

Cited By ~ 2

Author(s):

Sara Lado ◽

Jean Pierre Elbers ◽

Angela Doskocil ◽

Davide Scaglione ◽

Emiliano Trucchi ◽

...

Keyword(s):

Demographic History ◽

Genomic Diversity ◽

Phylogeographic Structure ◽

Arid Environments ◽

Migration Patterns ◽

Migration Rates ◽

Global Migration ◽

Genome Wide ◽

A Genome ◽

Scale Population

Abstract Dromedaries have been essential for the prosperity of civilizations in arid environments and the dispersal of humans, goods and cultures along ancient, cross-continental trading routes. With increasing desertification their importance as livestock species is rising rapidly, but little is known about their genome-wide diversity and demographic history. As previous studies using few nuclear markers found weak phylogeographic structure, here we detected fine-scale population differentiation in dromedaries across Asia and Africa by adopting a genome-wide approach. Global patterns of effective migration rates revealed pathways of dispersal after domestication, following historic caravan routes like the Silk and Incense Roads. Our results show that a Pleistocene bottleneck and Medieval expansions during the rise of the Ottoman empire have shaped genome-wide diversity in modern dromedaries. By understanding subtle population structure we recognize the value of small, locally adapted populations and appeal for securing genomic diversity for a sustainable utilization of this key desert species.


Detection and application of genome-wide variations in peach for association and genetic relationship analysis

BMC Genetics ◽

10.1186/s12863-019-0799-8 ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 2

Author(s):

Liping Guan ◽

Ke Cao ◽

Yong Li ◽

Jian Guo ◽

Qiang Xu ◽

...

Keyword(s):

Genetic Relationship ◽

Dna Markers ◽

High Throughput Sequencing ◽

Prunus Persica ◽

Genetic Research ◽

Diploid Species ◽

Sequencing Data ◽

Relationship Analysis ◽

Genome Wide ◽

A Genome

Abstract Background: Peach (Prunus persica L.) is a diploid species and model plant of the Rosaceae family. In the past decade, significant progress has been made in peach genetic research via DNA markers, but the number of these markers remains limited. Results: In this study, we performed genome-wide DNA marker detection based on sequencing data of six distantly related peach accessions. A total of 650,693~1,053,547 single nucleotide polymorphisms (SNPs), 114,227~178,968 small insertions/deletions (InDels), 8386~12,298 structural variants (SVs), 2111~2581 copy number variants (CNVs) and 229,357~346,940 simple sequence repeats (SSRs) were detected and annotated. To demonstrate the application of DNA markers, 944 SNPs were filtered for an association study of fruit ripening time and 15 highly polymorphic SSRs were selected to analyze the genetic relationship among 221 accessions. Conclusions: The results showed that the use of high-throughput sequencing to develop DNA markers is fast and effective. Comprehensive identification of DNA markers, including SVs and SSRs, would benefit genetic diversity evaluation, genetic mapping, and molecular breeding of peach.


A genome-wide identification, characterization and functional analysis of salt-related long non-coding RNAs in non-model plant Pistacia vera L. using transcriptome high throughput sequencing

Scientific Reports ◽

10.1038/s41598-020-62108-6 ◽

2020 ◽

Vol 10 (1) ◽

Cited By ~ 3

Author(s):

Masoomeh Jannesar ◽

Seyed Mahdi Seyedi ◽

Maryam Moazzam Jazi ◽

Vahid Niknam ◽

Hassan Ebrahimzadeh ◽

...

Keyword(s):

Functional Analysis ◽

High Throughput ◽

High Throughput Sequencing ◽

Pistacia Vera ◽

Model Plant ◽

Genome Wide ◽

A Genome ◽

Non Coding Rnas


Constructing a Reference Genome in a Single Lab: The Possibility to Use Oxford Nanopore Technology

Plants ◽

10.3390/plants8080270 ◽

2019 ◽

Vol 8 (8) ◽

pp. 270 ◽

Cited By ~ 4

Author(s):

Yun Gyeong Lee ◽

Sang Chul Choi ◽

Yuna Kang ◽

Kyeong Min Kim ◽

Chon-Sik Kang ◽

...

Keyword(s):

Plant Species ◽

Genome Sequencing ◽

Reference Genome ◽

Genome Structure ◽

Plant Genome ◽

Sequence Information ◽

Sequencing Analysis ◽

Oxford Nanopore ◽

A Genome ◽

Long Read

Whole genome sequencing (WGS) has become a crucial tool in understanding genome structure and genetic variation. MinION sequencing from Oxford Nanopore Technologies (ONT) is an excellent approach for performing WGS and has advantages over other Next-Generation Sequencing (NGS) methods: it is relatively inexpensive, portable, has simple library preparation, can be monitored in real time, and has no theoretical limit on read length. Sorghum bicolor (L.) Moench is diploid (2n = 2x = 20) with a genome size of about 730 Mb, and its genome sequence information has been released in the Phytozome database. Therefore, sorghum can be used as a good reference. However, plant species have complex and large genomes when compared to animals or microorganisms. As a result, complete genome sequencing is difficult for plant species. MinION sequencing, which produces long reads, can be an excellent tool for overcoming the weak assembly of short reads generated from NGS by minimizing the generation of gaps or by covering the repetitive sequences that appear in plant genomes. Here, we sequenced the genome of S. bicolor cv. BTx623 using the MinION platform and obtained 895,678 reads and 17.9 gigabytes (Gb) (ca. 25× coverage of the reference) of long-read sequence data. De novo assembly with two different tools generated a total of 6124 contigs (covering 45.9%) from Canu and a total of 2661 contigs (covering 50%) from Minimap and Miniasm with Racon, and the assembled contigs were mapped against the sorghum reference genome. Our results provide an optimal series of long-read sequencing analyses for plant species using the MinION platform and a clue to determining the total sequencing scale for optimal coverage based on various genome sizes.
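The "ca. 25× coverage" figure in this abstract can be checked with simple arithmetic: fold coverage is total sequenced bases divided by genome size. The sketch below assumes that "17.9 Gb" denotes about 17.9 billion bases, which the abstract does not state explicitly. The function and variable names are illustrative only.

```python
def fold_coverage(total_bases, genome_size_bases):
    """Average sequencing depth: total bases sequenced / genome size."""
    return total_bases / genome_size_bases


# Figures quoted in the abstract; interpreting "Gb" as gigabases is an assumption.
total_bases = 17.9e9   # long-read data obtained
genome_size = 730e6    # approximate sorghum genome size
n_reads = 895_678      # number of reads reported

coverage = fold_coverage(total_bases, genome_size)
mean_read_length = total_bases / n_reads
```

Under that assumption, the computed depth is a little under 25×, consistent with the rounded value in the abstract. The same division gives a rough mean read length from the reported read count.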
