No ‘small genome attraction’ artifact: A response to Harish et al. ‘Did viruses evolve as a distinct supergroup from common ancestors of cells?’

2016 ◽  
Author(s):  
Arshan Nasir ◽  
Kyung Mo Kim ◽  
Gustavo Caetano-Anollés

In a recent eLetter and associated preprint, Harish, Abroi, Gough and Kurland criticized our structural phylogenomic methods, which support the early cellular origin of viruses. Their claims include the argument that the rooting of our trees is artifactual and distorted by small genome (proteome) size. Here we uncover their aprioristic reasoning, which mingles with misunderstandings and misinterpretations of cladistic methodology. To demonstrate, we labeled the phylogenetic positions of the smallest proteomes in our phylogenetic trees and confirm that the smallest genomes were neither attracted towards the root nor caused any distortions in the four-supergroup tree of life. Their results therefore stem from confusing outgroups with ancestors and handpicking problematic taxa to distort tree reconstruction. In doing so, they ignored the details of our rooting method, taxa sampling rationale, the plethora of evidence given in our study supporting the ancient origin of the viral supergroup and also recent literature on viral evolution. Indeed, our tree of life uncovered many viral monophyletic groups consistent with ICTV classifications and showed remarkable evolutionary tracings of virion morphotypes onto a revealing tree topology.

2021 ◽  
Vol 7 (12) ◽  
pp. eabe2741
Author(s):  
Paschalia Kapli ◽  
Paschalis Natsidis ◽  
Daniel J. Leite ◽  
Maximilian Fursman ◽  
Nadia Jeffrie ◽  
...  

The bilaterally symmetric animals (Bilateria) are considered to comprise two monophyletic groups, Protostomia (Ecdysozoa and the Lophotrochozoa) and Deuterostomia (Chordata and the Xenambulacraria). Recent molecular phylogenetic studies have not consistently supported deuterostome monophyly. Here, we compare support for Protostomia and Deuterostomia using multiple, independent phylogenomic datasets. As expected, Protostomia is always strongly supported, especially by longer and higher-quality genes. Support for Deuterostomia, however, is always equivocal and barely higher than support for paraphyletic alternatives. Conditions that cause tree reconstruction errors—inadequate models, short internal branches, faster evolving genes, and unequal branch lengths—coincide with support for monophyletic deuterostomes. Simulation experiments show that support for Deuterostomia could be explained by systematic error. The branch between bilaterian and deuterostome common ancestors is, at best, very short, supporting the idea that the bilaterian ancestor may have been deuterostome-like. Our findings have important implications for the understanding of early animal evolution.


PLoS ONE ◽  
2020 ◽  
Vol 15 (12) ◽  
pp. e0240953
Author(s):  
Christian Schulz ◽  
Eivind Almaas

Approaches for systematizing information on relatedness between organisms are important in biology. Phylogenetic analyses based on sets of highly conserved genes are currently the basis for the Tree of Life. Genome-scale metabolic reconstructions contain high-quality information regarding the metabolic capability of an organism and are typically restricted to metabolically active enzyme-encoding genes. While there are many tools available to generate draft reconstructions, expert-level knowledge is still required to generate and manually curate high-quality genome-scale metabolic models and to fill gaps in their reaction networks. Here, we use the tool AutoKEGGRec to construct 975 genome-scale metabolic draft reconstructions encoded in the KEGG database without further curation. The organisms are selected across all three domains, and their metabolic networks serve as the basis for generating phylogenetic trees. We find that using all reactions encoded, these metabolism-based comparisons give rise to a phylogenetic tree with close similarity to the Tree of Life. While this tree is quite robust to reasonable levels of noise in the metabolic reaction content of an organism, we find a significant heterogeneity in how much noise an organism may tolerate before it is incorrectly placed in the tree. Furthermore, by using the protein sequences for particular metabolic functions and pathway sets, such as central carbon-, nitrogen-, and sulfur-metabolism, as the basis for the organism comparisons, we generate highly specific phylogenetic trees. We believe the generation of phylogenetic trees based on metabolic reaction content, in particular when focused on specific functions and pathways, could aid the identification of functionally important metabolic enzymes and be of value for genome-scale metabolic modellers and enzyme-engineers.
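One way to picture a comparison based on reaction content is a pairwise distance between the sets of reactions each organism encodes. The sketch below uses the Jaccard distance for this purpose. The choice of Jaccard distance, the organism names, and the reaction identifiers are all assumptions made for illustration; the abstract does not state which measure the authors used.

```python
from itertools import combinations


def jaccard_distance(a: set, b: set) -> float:
    """Return 1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(reactions: dict) -> dict:
    """Pairwise distances keyed by (organism_1, organism_2) in sorted order."""
    return {
        (x, y): jaccard_distance(reactions[x], reactions[y])
        for x, y in combinations(sorted(reactions), 2)
    }


# Hypothetical toy data: organism name -> set of reaction identifiers.
toy = {
    "org_a": {"R1", "R2", "R3", "R4"},
    "org_b": {"R1", "R2", "R3", "R5"},
    "org_c": {"R6", "R7"},
}
distances = distance_matrix(toy)
```

A matrix of this kind could then be passed to any distance-based tree-building method, such as neighbor joining.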


2022 ◽  
Vol 22 (1) ◽  
Author(s):  
Monique Aouad ◽  
Jean-Pierre Flandrois ◽  
Frédéric Jauffrit ◽  
Manolo Gouy ◽  
Simonetta Gribaldo ◽  
...  

Background: The recent rise in cultivation-independent genome sequencing has provided key material to explore uncharted branches of the Tree of Life. This has been particularly spectacular concerning the Archaea, projecting them at the center stage as prominently relevant to understand early stages in evolution and the emergence of fundamental metabolisms as well as the origin of eukaryotes. Yet, resolving deep divergences remains a challenging task due to well-known tree-reconstruction artefacts and biases in extracting robust ancient phylogenetic signal, notably when analyzing data sets including the three Domains of Life. Among the various strategies aimed at mitigating these problems, divide-and-conquer approaches remain poorly explored, and have been primarily based on reconciliation among single-gene trees, which, however, notoriously lack ancient phylogenetic signal. Results: We analyzed sub-sets of full supermatrices covering the whole Tree of Life with specific taxonomic sampling to robustly resolve different parts of the archaeal phylogeny in light of their current diversity. Our results strongly support the existence and early emergence of two main clades, Cluster I and Cluster II, which we name Ouranosarchaea and Gaiarchaea, and we clarify the placement of important novel archaeal lineages within these two clades. However, the monophyly and branching of the fast-evolving nanosized DPANN members remain unclear and worthy of further study. Conclusions: We inferred a well-resolved rooted phylogeny of the Archaea that includes all recently described phyla of high taxonomic rank. This phylogeny represents a valuable reference to study the evolutionary events associated with the early steps of the diversification of the archaeal domain. Beyond the specifics of archaeal phylogeny, our results demonstrate the power of divide-and-conquer approaches to resolve deep phylogenetic relationships, which should be applied to progressively resolve the entire Tree of Life.


2018 ◽  
Author(s):  
Stephen T. Pollard ◽  
Kenji Fukushima ◽  
Zhengyuan O. Wang ◽  
Todd A. Castoe ◽  
David D. Pollock

Phylogenetic inference requires a means to search phylogenetic tree space. This is usually achieved using progressive algorithms that propose and test small alterations in the current tree topology and branch lengths. Current programs search tree topology space using branch-swapping algorithms, but proposals do not discriminate well between swaps likely to succeed or fail. When applied to datasets with many taxa, the huge number of possible topologies slows these programs dramatically. To overcome this, we developed a statistical approach for proposal generation in Bayesian analysis, and evaluated its applicability for the problem of searching phylogenetic tree space. The general idea of the approach, which we call ‘Markov katana’, is to make proposals based on a heuristic algorithm using bootstrapped subsets of the data. Such proposals induce an unintended sampling distribution that must be determined and removed to generate posterior estimates, but the cost of this extra step can in principle be small compared to the added value of more efficient parameter exploration in Markov chain Monte Carlo analyses. Our prototype application uses the simple neighbor-joining distance heuristic on data subsets to propose new reasonably likely phylogenetic trees (including topologies and branch lengths). The evolutionary model used to generate distances in our prototype was far simpler than the more complex model used to evaluate the likelihood of phylogenies based on the full dataset. This prototype implementation indicates that the Markov katana approach could be easily incorporated into existing phylogenetic search programs and may prove a useful alternative in conjunction with existing methods. The general features of this statistical approach may also prove useful in disciplines other than phylogenetics. We demonstrate that this method can be used to efficiently estimate a Bayesian posterior.


Viruses ◽  
2019 ◽  
Vol 11 (5) ◽  
pp. 418 ◽  
Author(s):  
Akhtar Ali ◽  
Ulrich Melcher

Diverse studies of viral evolution have led to the recognition that the evolutionary rates of viral taxa observed are dependent on the time scale being investigated—with short-term studies giving fast substitution rates, and orders of magnitude lower rates for deep calibrations. Although each of these factors may contribute to this time dependent rate phenomenon, a more fundamental cause should be considered. We sought to test computationally whether the basic phenomena of virus evolution (mutation, replication, and selection) can explain the relationships between the evolutionary and phylogenetic distances. We tested, by computational inference, the hypothesis that the phylogenetic distances between the pairs of sequences are functions of the evolutionary path lengths between them. A Basic simulation revealed that the relationship between simulated genetic and mutational distances is non-linear, and can be consistent with different rates of nucleotide substitution at different depths of branches in phylogenetic trees.
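The non-linear relationship between the number of mutational events and the observed genetic distance can be illustrated with a toy simulation. The sketch below is not the authors' simulation: it assumes a uniform substitution model in which every site and every base change is equally likely (a Jukes-Cantor-style assumption), and all function names are hypothetical. Repeated hits at the same site overwrite or revert earlier changes, so the observed distance saturates while the mutational distance keeps growing.

```python
import math
import random

BASES = "ACGT"


def simulate_distances(seq_len: int, n_events: int, seed: int = 0):
    """Apply n_events random substitutions to a random sequence and return
    (mutational distance, observed genetic distance), both per site."""
    rng = random.Random(seed)
    ancestor = [rng.choice(BASES) for _ in range(seq_len)]
    descendant = list(ancestor)
    for _ in range(n_events):
        site = rng.randrange(seq_len)
        # Each event changes the base, so a later hit can mask an earlier one.
        descendant[site] = rng.choice([b for b in BASES if b != descendant[site]])
    observed = sum(a != d for a, d in zip(ancestor, descendant))
    return n_events / seq_len, observed / seq_len


def expected_observed(mutational: float) -> float:
    """Jukes-Cantor expectation for the proportion of differing sites."""
    return 0.75 * (1.0 - math.exp(-4.0 * mutational / 3.0))


mutational, observed = simulate_distances(seq_len=1000, n_events=2000)
```

Under these assumptions the observed proportion of differing sites approaches 0.75 as events accumulate, which is one way that deep branches can appear to evolve more slowly than shallow ones.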


2021 ◽  
Author(s):  
Ashley E Nazario-Toole ◽  
Hui Xia ◽  
Thomas F Gibbons

Introduction: The outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has created a global pandemic resulting in over 1 million deaths worldwide. In the Department of Defense (DoD), over 129,000 personnel (civilians, dependents, and active duty) have been infected with the virus to date. Rapid estimations of transmission and mutational patterns of virus outbreaks can be accomplished using whole-genome viral sequencing. Deriving interpretable and actionable results from pathogen sequence data is accomplished by the construction of phylogenetic trees (from local and global virus sequences) and by the creation of protein maps, to visualize and predict the effects of structural protein amino acid mutations. Materials and Methods: We developed a sequencing and bioinformatics workflow for molecular epidemiological SARS-CoV-2 surveillance using excess clinical specimens collected under an institutional review board exempt protocol at Joint Base San Antonio, Lackland AFB. This workflow includes viral RNA isolation, viral load quantification, tiling-based next-generation sequencing, sequencing and bioinformatics analysis, and data visualization via phylogenetic trees and protein mapping. Results: Sequencing of 37 clinical specimens collected at JBSA/Lackland revealed that by June 2020, SARS-CoV-2 strains carrying the 614G mutation were the predominant cause of local coronavirus disease 2019 infections. We identified 109 nucleotide changes in the coding region of the SARS-CoV-2 genome (which lead to 63 unique, non-synonymous amino acid mutations), one mutation in the 5ʹ-untranslated region (UTR), and two mutations in the 3ʹUTR. Furthermore, we identified and mapped six additional spike protein amino acid changes, information which could potentially aid vaccine design. Conclusion: The workflow presented here is designed to enable DoD public health officials to track viral evolution and conduct near real-time evaluation of future outbreaks. The generation of molecular epidemiological sequence data is critical for the development of disease intervention strategies, most notably vaccine design. Overall, we present a streamlined sequencing and bioinformatics methodology aimed at improving long-term readiness efforts in the DoD.


2016 ◽  
Vol 113 (10) ◽  
pp. 2690-2695 ◽  
Author(s):  
Ethan O. Romero-Severson ◽  
Ingo Bulla ◽  
Thomas Leitner

Although the use of phylogenetic trees in epidemiological investigations has become commonplace, their epidemiological interpretation has not been systematically evaluated. Here, we use an HIV-1 within-host coalescent model to probabilistically evaluate transmission histories of two epidemiologically linked hosts. Previous critique of phylogenetic reconstruction has claimed that direction of transmission is difficult to infer, and that the existence of unsampled intermediary links or common sources can never be excluded. The phylogenetic relationship between the HIV populations of epidemiologically linked hosts can be classified into six types of trees, based on cladistic relationships and whether the reconstruction is consistent with the true transmission history or not. We show that the direction of transmission and whether unsampled intermediary links or common sources existed make very different predictions about expected phylogenetic relationships: (i) Direction of transmission can often be established when paraphyly exists, (ii) intermediary links can be excluded when multiple lineages were transmitted, and (iii) when the sampled individuals’ HIV populations both are monophyletic a common source was likely the origin. Inconsistent results, suggesting the wrong transmission direction, were generally rare. In addition, the expected tree topology also depends on the number of transmitted lineages, the sample size, the time of the sample relative to transmission, and how fast the diversity increases after infection. Typically, 20 or more sequences per subject give robust results. We confirm our theoretical evaluations with analyses of real transmission histories and discuss how our findings should aid in interpreting phylogenetic results.
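The cladistic relationships mentioned above depend on whether each host's sampled sequences form a clade. The sketch below checks this on a rooted tree written as nested tuples. The tree encoding, the function names, and the three-way grouping are assumptions made for illustration; this is a simplification, not the six-type scheme of the paper, and it says nothing about whether a reconstruction matches the true transmission history.

```python
def tips(tree):
    """Return the tip labels under a node (a str tip or a tuple of children)."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(tips(child) for child in tree))


def clades(tree):
    """Yield the tip set of every node in the rooted tree."""
    yield tips(tree)
    if not isinstance(tree, str):
        for child in tree:
            yield from clades(child)


def is_monophyletic(tree, labels):
    """True if some node's tip set is exactly the given labels."""
    target = set(labels)
    return any(clade == target for clade in clades(tree))


def classify(tree, host_a, host_b):
    """Group the topology by how many hosts' samples form a clade."""
    count = is_monophyletic(tree, host_a) + is_monophyletic(tree, host_b)
    return {
        2: "both monophyletic",
        1: "one host not monophyletic",
        0: "neither monophyletic",
    }[count]


# Hypothetical example: host B's sequences are nested inside host A's.
host_a = ["A1", "A2", "A3"]
host_b = ["B1", "B2"]
nested = (("A1", ("A2", ("B1", "B2"))), "A3")
result = classify(nested, host_a, host_b)
```

In the nested example, host B forms a clade inside host A's diversity, which is the paraphyletic pattern the abstract associates with an inferable direction of transmission.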


1993 ◽  
Vol 6 (5) ◽  
pp. 441 ◽  
Author(s):  
PG Martin ◽  
JM Dowd

Sequences of rbcL for 23 species of Nothofagus and three of Fagus have been determined and analysed to form phylogenetic trees. The two genera are well separated. The species of Nothofagus separate into lineages which correspond exactly with the subgenera recently defined on morphological grounds. The rate of evolution of the four subgenera is shown to be statistically the same and, using a reference date from palaeobotany, is found to be one nucleotide change in 6 Ma. This rate is used to derive the ages of the common ancestors of species in subgenera and it is tentatively concluded that intercontinental dispersal was possible in the early stages of the evolution of the genus.
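The dating step amounts to simple arithmetic once a clock rate is fixed. The sketch below uses the rate quoted in the abstract (one nucleotide change in 6 Ma) and rests on two labeled assumptions that the abstract does not confirm: that the rate applies to a single lineage, and that multiple hits at the same site can be ignored. The function names and the example counts are hypothetical.

```python
MA_PER_CHANGE = 6.0  # from the abstract: one nucleotide change in 6 Ma


def ancestor_age_ma(changes_along_lineage: float) -> float:
    """Age of an ancestor in Ma, assuming the rate applies to one lineage."""
    return changes_along_lineage * MA_PER_CHANGE


def ancestor_age_from_pair(pairwise_differences: float) -> float:
    """Age of the common ancestor of two living species.

    Assumption: differences accumulate equally on both lineages,
    so each lineage carries half of the pairwise count.
    """
    return ancestor_age_ma(pairwise_differences / 2.0)


# Hypothetical example: 10 differences between two species.
age = ancestor_age_from_pair(10)
```

If the published rate instead refers to pairwise divergence, the halving step would not apply, so the interpretation should be checked against the paper before reuse.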


2020 ◽  
Author(s):  
Emily Jane McTavish ◽  
Luna L Sanchez Reyes ◽  
Mark T. Holder

The Open Tree of Life project constructs a comprehensive, dynamic and digitally available tree of life by synthesizing published phylogenetic trees along with taxonomic data. Open Tree of Life provides web-service application programming interfaces (APIs) to make the tree estimate, unified taxonomy, and input phylogenetic data available to anyone. Here, we describe the Python package 'opentree', which provides a user-friendly Python wrapper for these APIs and a set of scripts and tutorials for straightforward downstream data analyses. We demonstrate the utility of these tools by generating an estimate of the phylogenetic relationships of all bird families, and by capturing a phylogenetic estimate for all taxa observed at the University of California Merced Vernal Pools and Grassland Reserve.

