Eutherian comparative genomic analysis protocol

Mapping Intimacies ◽

10.21203/rs.2.1502/v2 ◽

2019 ◽

Author(s):

Marko Premzl

Keyword(s):

Genomic Sequence ◽

Sequence Data ◽

Genomic Analysis ◽

Comparative Genomic Analysis ◽

Third Party ◽

Comparative Genomic ◽

Data Sets ◽

Gene Data ◽

Sequence Errors ◽

Analysis Protocol

Abstract The eutherian genomics momentum greatly advanced biology and medicine. Nevertheless, future revisions and updates of eutherian genomic sequence data sets were expected, due to potential genomic sequence errors and incompleteness of genomic sequences. The eutherian comparative genomic analysis protocol was established as guidance in protection against potential genomic sequence errors in public eutherian genomic sequence assemblies. The protocol revised, updated and published 11 major eutherian gene data sets, including 1504 complete coding sequences deposited in European Nucleotide Archive as curated third party data gene data sets under accession numbers: FR734011-FR734074, HF564658-HF564785, HF564786-HF564815, HG328835-HG329089, HG426065-HG426183, HG931734-HG931849, LM644135-LM644234, LN874312-LN874522, LT548096-LT548244, LT631550-LT631670 and LT962964-LT963174.


Eutherian comparative genomic analysis protocol

10.21203/rs.2.1502/v3 ◽

2020 ◽

Author(s):

Marko Premzl

Keyword(s):

Genomic Sequence ◽

Sequence Data ◽

Genomic Analysis ◽

Comparative Genomic Analysis ◽

Third Party ◽

Comparative Genomic ◽

Data Sets ◽

Gene Data ◽

Sequence Errors ◽

Analysis Protocol

Abstract The eutherian genomics momentum greatly advanced biology and medicine. Nevertheless, future revisions and updates of eutherian genomic sequence data sets were expected, due to potential genomic sequence errors and incompleteness of genomic sequences. The eutherian comparative genomic analysis protocol was established as guidance in protection against potential genomic sequence errors in public eutherian genomic sequence assemblies. The protocol revised, updated and published 12 major eutherian gene data sets, including 1853 complete coding sequences deposited in European Nucleotide Archive as curated third party data gene data sets under accession numbers: FR734011-FR734074, HF564658-HF564785, HF564786-HF564815, HG328835-HG329089, HG426065-HG426183, HG931734-HG931849, LM644135-LM644234, LN874312-LN874522, LT548096-LT548244, LT631550-LT631670, LT962964-LT963174 and LT990249-LT990597.


Eutherian comparative genomic analysis protocol

10.21203/rs.2.1502/v4 ◽

2021 ◽

Author(s):

Marko Premzl

Keyword(s):

Genomic Sequence ◽

Sequence Data ◽

Genomic Analysis ◽

Comparative Genomic Analysis ◽

Third Party ◽

Comparative Genomic ◽

Data Sets ◽

Gene Data ◽

Sequence Errors ◽

Analysis Protocol

Abstract The eutherian genomics momentum greatly advanced biological and medical sciences. Yet, future revisions and updates of eutherian genomic sequence data sets were expected, due to potential genomic sequence errors and incompleteness of genomic sequences. The eutherian comparative genomic analysis protocol was established as guidance in protection against potential genomic sequence errors in public eutherian genomic sequence assemblies. The protocol revised, updated and published 14 major eutherian gene data sets, including 2615 complete coding sequences deposited in European Nucleotide Archive as curated third party data gene data sets under accession numbers: FR734011-FR734074, HF564658-HF564785, HF564786-HF564815, HG328835-HG329089, HG426065-HG426183, HG931734-HG931849, LM644135-LM644234, LN874312-LN874522, LT548096-LT548244, LT631550-LT631670, LT962964-LT963174, LT990249-LT990597, LR130242-LR130508 and LR760818-LR761312.
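The sequence count quoted in this abstract can be checked against the listed accession ranges. The sketch below is only an illustration, not part of the protocol: it assumes each range is a contiguous block in which every accession number corresponds to one complete coding sequence, and the names `ACCESSION_RANGES` and `range_size` are hypothetical.

```python
import re

# Accession ranges copied from the version 4 abstract, in the order listed there.
ACCESSION_RANGES = (
    "FR734011-FR734074, HF564658-HF564785, HF564786-HF564815, "
    "HG328835-HG329089, HG426065-HG426183, HG931734-HG931849, "
    "LM644135-LM644234, LN874312-LN874522, LT548096-LT548244, "
    "LT631550-LT631670, LT962964-LT963174, LT990249-LT990597, "
    "LR130242-LR130508, LR760818-LR761312"
)


def range_size(span):
    """Return the number of accessions in an inclusive range such as 'FR734011-FR734074'."""
    first, last = span.strip().split("-")
    m_first = re.fullmatch(r"([A-Z]+)(\d+)", first)
    m_last = re.fullmatch(r"([A-Z]+)(\d+)", last)
    # Both ends must share the same letter prefix for the range to be meaningful.
    if not (m_first and m_last) or m_first.group(1) != m_last.group(1):
        raise ValueError(f"unexpected accession range: {span!r}")
    return int(m_last.group(2)) - int(m_first.group(2)) + 1


sizes = [range_size(span) for span in ACCESSION_RANGES.split(",")]
total = sum(sizes)

print(len(sizes), total)                 # 14 2615
print(sum(sizes[:11]), sum(sizes[:12]))  # 1504 1853
```

Under that assumption, the 14 ranges add up to the 2615 sequences stated here, and the first 11 and 12 ranges reproduce the 1504 and 1853 sequences reported in versions 2 and 3.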


Comparative genomic analysis of eutherian connexin genes

Scientific Reports ◽

10.1038/s41598-019-53458-x ◽

2019 ◽

Vol 9 (1) ◽

Cited By ~ 2

Author(s):

Marko Premzl

Keyword(s):

Genomic Analysis ◽

Comparative Genomic Analysis ◽

Comparative Genomic ◽

Data Sets ◽

Data Set ◽

Coding Sequences ◽

Gene Annotations ◽

Connexin Gene ◽

Connexin Genes ◽

Gene Data

Abstract The eutherian connexins were characterized as protein constituents of gap junctions implicated in cell-cell communications between adjoining cells in multiple cell types, regulation of major physiological processes and disease pathogeneses. However, conventional connexin gene and protein classifications could be regarded as unsuitable in descriptions of comprehensive eutherian connexin gene data sets, due to ambiguities and inconsistencies in connexin gene and protein nomenclatures. Using eutherian comparative genomic analysis protocol and 35 public eutherian reference genomic sequence data sets, the present analysis attempted to update and revise comprehensive eutherian connexin gene data sets, and address and resolve major discrepancies in their descriptions. Among 631 potential coding sequences, the tests of reliability of eutherian public genomic sequences annotated, in aggregate, 349 connexin complete coding sequences. The most comprehensive curated eutherian connexin gene data set described 21 major gene clusters, 4 of which included evidence of differential gene expansions. For example, the present gene annotations initially described human CXNK1 gene and annotated 22 human connexin genes. Phylogenetic tree calculations and calculations of pairwise nucleotide sequence identity patterns proposed revised and updated phylogenetic classification of eutherian connexin genes. Therefore, the present study integrating gene annotations, phylogenetic analysis and protein molecular evolution analysis proposed new nomenclature of eutherian connexin genes and proteins.
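The abstract mentions pairwise nucleotide sequence identity without defining how it was computed. As a rough illustration only, the sketch below shows one common definition: the fraction of matching positions between two already aligned sequences, skipping any column where either sequence has a gap. The function name `pairwise_identity` and the toy sequences are hypothetical, and the gap handling is an assumption rather than the study's method.

```python
def pairwise_identity(seq_a, seq_b):
    """Fraction of identical bases over aligned, gap-free columns.

    Assumes seq_a and seq_b come from the same alignment (equal length,
    '-' marks a gap). Columns with a gap in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # ignore gapped columns
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return matches / compared


# Toy aligned sequences, invented for illustration.
print(pairwise_identity("ATGC", "ATGA"))  # 0.75
print(pairwise_identity("AT-C", "ATGC"))  # 1.0
```

Other conventions (for example, counting gaps as mismatches) give different values, so the choice should be stated alongside any reported identity.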


A Bayesian implementation of the multispecies coalescent model with introgression for comparative genomic analysis

10.1101/766741 ◽

2019 ◽

Cited By ~ 1

Author(s):

Thomas Flouris ◽

Xiyun Jiao ◽

Bruce Rannala ◽

Ziheng Yang

Keyword(s):

Gene Flow ◽

Genomic Sequence ◽

Sequence Data ◽

Incomplete Lineage Sorting ◽

Mosquito Species ◽

Genomic Analysis ◽

Comparative Genomic ◽

Lineage Sorting ◽

Multispecies Coalescent ◽

Important Means

Abstract Recent analyses suggest that cross-species gene flow or introgression is common in nature, especially during species divergences. Genomic sequence data can be used to infer introgression events and to estimate the timing and intensity of introgression, providing an important means to advance our understanding of the role of gene flow in speciation. Here we implement the multispecies-coalescent-with-introgression (MSci) model, an extension of the multispecies-coalescent (MSC) model to incorporate introgression, in our Bayesian Markov chain Monte Carlo (MCMC) program BPP. The MSci model accommodates deep coalescence (or incomplete lineage sorting) and introgression and provides a natural framework for inference using genomic sequence data. Computer simulation confirms the good statistical properties of the method, although hundreds or thousands of loci are typically needed to estimate introgression probabilities reliably. Re-analysis of datasets from the purple cone spruce confirms the hypothesis of homoploid hybrid speciation. We estimated the introgression probability using the genomic sequence data from six mosquito species in the Anopheles gambiae species complex, which varies considerably across the genome, likely driven by differential selection against introgressed alleles.


Candidate pathogenicity islands in the genome of ‘Candidatus Rickettsiella isopodorum’, an intracellular bacterium infecting terrestrial isopod crustaceans

PeerJ ◽

10.7717/peerj.2806 ◽

2016 ◽

Vol 4 ◽

pp. e2806 ◽

Cited By ~ 9

Author(s):

YaDong Wang ◽

Christopher Chandler

Keyword(s):

Genomic Island ◽

Genomic Sequence ◽

Genomic Analysis ◽

Comparative Genomic Analysis ◽

Comparative Genomic ◽

Photorhabdus Luminescens ◽

Future Studies ◽

Terrestrial Isopod ◽

Microbe Interactions ◽

Host Microbe Interactions

The bacterial genus Rickettsiella belongs to the order Legionellales in the Gammaproteobacteria, and consists of several described species and pathotypes, most of which are considered to be intracellular pathogens infecting arthropods. Two members of this genus, R. grylli and R. isopodorum, are known to infect terrestrial isopod crustaceans. In this study, we assembled a draft genomic sequence for R. isopodorum, and performed a comparative genomic analysis with R. grylli. We found evidence for several candidate genomic island regions in R. isopodorum, none of which appear in the previously available R. grylli genome sequence. Furthermore, one of these genomic island candidates in R. isopodorum contained a gene that encodes a cytotoxin partially homologous to those found in Photorhabdus luminescens and Xenorhabdus nematophilus (Enterobacteriaceae), suggesting that horizontal gene transfer may have played a role in the evolution of pathogenicity in Rickettsiella. These results lay the groundwork for future studies on the mechanisms underlying pathogenesis in R. isopodorum, and this system may provide a good model for studying the evolution of host-microbe interactions in nature.


Eutherian comparative genomic analysis protocol

Protocol Exchange ◽

10.1038/protex.2018.028 ◽

2018 ◽

Cited By ~ 1

Author(s):

Marko Premzl

Keyword(s):

Genomic Analysis ◽

Comparative Genomic Analysis ◽

Comparative Genomic ◽

Analysis Protocol


Phylogenomic Reconstruction and Metabolic Potential of the Genus Aminobacter

Microorganisms ◽

10.3390/microorganisms9061332 ◽

2021 ◽

Vol 9 (6) ◽

pp. 1332

Author(s):

Irene Artuso ◽

Paolo Turrini ◽

Mattia Pirolo ◽

Gabriele Andrea Lugli ◽

Marco Ventura ◽

...

Keyword(s):

Dna Hybridization ◽

Genetic Basis ◽

Genomic Sequence ◽

Genomic Analysis ◽

Comparative Genomic Analysis ◽

Species Level ◽

Comparative Genomic ◽

Metabolic Potential ◽

Methyl Halides ◽

Terrestrial Environments

Bacteria belonging to the genus Aminobacter are metabolically versatile organisms thriving in both natural and anthropized terrestrial environments. To date, the taxonomy of this genus is poorly defined due to the unavailability of the genomic sequence of A. anthyllidis LMG 26462T and the presence of unclassified Aminobacter strains. Here, we determined the genome sequence of A. anthyllidis LMG 26462T and performed phylogenomic, average nucleotide identity and digital DNA-DNA hybridization analyses of 17 members of genus Aminobacter. Our results indicate that 16S rRNA-based phylogeny does not provide sufficient species-level discrimination, since most of the unclassified Aminobacter strains belong to valid Aminobacter species or are putative new species. Since some members of the genus Aminobacter can utilize certain C1 compounds, such as methylamines and methyl halides, a comparative genomic analysis was performed to characterize the genetic basis of some degradative/assimilative pathways in the whole genus. Our findings suggest that all Aminobacter species are heterotrophic methylotrophs able to generate the methylene tetrahydrofolate intermediate through multiple oxidative pathways of C1 compounds and convey it in the serine cycle. Moreover, all Aminobacter species carry genes implicated in the degradation of phosphonates via the C-P lyase pathway, whereas only A. anthyllidis LMG 26462T contains a symbiosis island implicated in nodulation and nitrogen fixation.


Comparative genomic analysis revealed rapid differentiation in the pathogenicity-related gene repertoires between Pyricularia oryzae and Pyricularia penniseti isolated from a Pennisetum grass

10.1101/360016 ◽

2018 ◽

Author(s):

Huakun Zheng ◽

Zhenhui Zhong ◽

Mingyue Shi ◽

Limei Zhang ◽

Lianyu Lin ◽

...

Keyword(s):

Genetic Study ◽

Genomic Sequence ◽

Genomic Analysis ◽

Comparative Genomic Analysis ◽

Blast Disease ◽

Individual Species ◽

Comparative Genomic ◽

Phylogenomic Analysis ◽

Avirulence Genes ◽

Related Gene

Abstract Backgrounds: Pyricularia is a multispecies complex that could infect and cause severe blast disease on diverse hosts, including rice, wheat and many other grasses. Although the genome size of this fungal complex is small [~40 Mbp for Pyricularia oryzae (syn. Magnaporthe oryzae), and ~45 Mbp for P. grisea], the genome plasticity allows the fungus to jump and adapt to new hosts. Therefore, deciphering the genome basis of individual species could facilitate the evolutionary and genetic study of this fungus. However, except for the P. oryzae subgroup, many other species isolated from diverse hosts, such as the Pennisetum grasses, remain largely uncovered genetically. Results: Here, we report the genome sequence of a pyriform-shaped fungal strain P. penniseti P1609 isolated from a Pennisetum grass (JUJUNCAO) using PacBio SMRT sequencing technology. We performed a phylogenomic analysis of 28 Magnaporthales species and 5 non-Magnaporthales species and addressed P1609 into a Pyricularia subclade that is distant from P. oryzae. Comparative genomic analysis revealed that the pathogenicity-related gene repertoires were fairly different between P1609 and the P. oryzae strain 70-15, including the cloned avirulence genes, other putative secreted proteins, as well as some other predicted Pathogen-Host Interaction (PHI) genes. Genomic sequence comparison also identified many genomic rearrangements. Conclusion: Taken together, our results suggested that the genomic sequence of the P. penniseti P1609 could be a useful resource for the genetic study of the Pennisetum-infecting Pyricularia species.
