Eutherian comparative genomic analysis protocol

2021
Author(s):  
Marko Premzl

Abstract: The eutherian genomics momentum greatly advanced biological and medical sciences. Yet, future revisions and updates of eutherian genomic sequence data sets were expected, due to potential genomic sequence errors and incompleteness of genomic sequences. The eutherian comparative genomic analysis protocol was established as guidance in protection against potential genomic sequence errors in public eutherian genomic sequence assemblies. The protocol revised, updated and published 14 major eutherian gene data sets, including 2615 complete coding sequences deposited in the European Nucleotide Archive as curated third party data gene data sets under accession numbers: FR734011-FR734074, HF564658-HF564785, HF564786-HF564815, HG328835-HG329089, HG426065-HG426183, HG931734-HG931849, LM644135-LM644234, LN874312-LN874522, LT548096-LT548244, LT631550-LT631670, LT962964-LT963174, LT990249-LT990597, LR130242-LR130508 and LR760818-LR761312.
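The accession ranges quoted in this abstract can be checked against the stated total of 2615 complete coding sequences. The following is a minimal sketch, assuming that each range is contiguous and inclusive and that every accession in a range corresponds to exactly one coding sequence; the function names `range_size` and `total_accessions` are illustrative and not part of the protocol.

```python
import re

# Accession ranges copied verbatim from the 2021 abstract.
RANGES_2021 = (
    "FR734011-FR734074, HF564658-HF564785, HF564786-HF564815, "
    "HG328835-HG329089, HG426065-HG426183, HG931734-HG931849, "
    "LM644135-LM644234, LN874312-LN874522, LT548096-LT548244, "
    "LT631550-LT631670, LT962964-LT963174, LT990249-LT990597, "
    "LR130242-LR130508 and LR760818-LR761312"
)

ACCESSION = re.compile(r"([A-Z]+)(\d+)")


def range_size(accession_range: str) -> int:
    """Count accessions in an inclusive range such as 'FR734011-FR734074'."""
    start, end = accession_range.strip().split("-")
    m_start = ACCESSION.fullmatch(start)
    m_end = ACCESSION.fullmatch(end)
    if not m_start or not m_end or m_start.group(1) != m_end.group(1):
        raise ValueError(f"malformed accession range: {accession_range!r}")
    return int(m_end.group(2)) - int(m_start.group(2)) + 1


def total_accessions(ranges: str) -> int:
    """Sum the sizes of ranges separated by commas or the word 'and'."""
    parts = [p for p in re.split(r",|\band\b", ranges) if p.strip()]
    return sum(range_size(p) for p in parts)


print(total_accessions(RANGES_2021))  # → 2615
```

Under the same assumption, the first 12 ranges sum to 1853 and the first 11 to 1504, which agrees with the totals reported in the 2020 and 2019 versions of the abstract.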

2020
Author(s):  
Marko Premzl

Abstract: The eutherian genomics momentum greatly advanced biology and medicine. Nevertheless, future revisions and updates of eutherian genomic sequence data sets were expected, due to potential genomic sequence errors and incompleteness of genomic sequences. The eutherian comparative genomic analysis protocol was established as guidance in protection against potential genomic sequence errors in public eutherian genomic sequence assemblies. The protocol revised, updated and published 12 major eutherian gene data sets, including 1853 complete coding sequences deposited in the European Nucleotide Archive as curated third party data gene data sets under accession numbers: FR734011-FR734074, HF564658-HF564785, HF564786-HF564815, HG328835-HG329089, HG426065-HG426183, HG931734-HG931849, LM644135-LM644234, LN874312-LN874522, LT548096-LT548244, LT631550-LT631670, LT962964-LT963174 and LT990249-LT990597.


2019
Author(s):  
Marko Premzl

Abstract: The eutherian genomics momentum greatly advanced biology and medicine. Nevertheless, future revisions and updates of eutherian genomic sequence data sets were expected, due to potential genomic sequence errors and incompleteness of genomic sequences. The eutherian comparative genomic analysis protocol was established as guidance in protection against potential genomic sequence errors in public eutherian genomic sequence assemblies. The protocol revised, updated and published 11 major eutherian gene data sets, including 1504 complete coding sequences deposited in the European Nucleotide Archive as curated third party data gene data sets under accession numbers: FR734011-FR734074, HF564658-HF564785, HF564786-HF564815, HG328835-HG329089, HG426065-HG426183, HG931734-HG931849, LM644135-LM644234, LN874312-LN874522, LT548096-LT548244, LT631550-LT631670 and LT962964-LT963174.


2019
Vol 9 (1)
Author(s):  
Marko Premzl

Abstract: The eutherian connexins were characterized as protein constituents of gap junctions implicated in cell-cell communications between adjoining cells in multiple cell types, regulation of major physiological processes and disease pathogeneses. However, conventional connexin gene and protein classifications could be regarded as unsuitable in descriptions of comprehensive eutherian connexin gene data sets, due to ambiguities and inconsistencies in connexin gene and protein nomenclatures. Using the eutherian comparative genomic analysis protocol and 35 public eutherian reference genomic sequence data sets, the present analysis attempted to update and revise comprehensive eutherian connexin gene data sets, and address and resolve major discrepancies in their descriptions. Among 631 potential coding sequences, the tests of reliability of eutherian public genomic sequences annotated, in aggregate, 349 connexin complete coding sequences. The most comprehensive curated eutherian connexin gene data set described 21 major gene clusters, 4 of which included evidence of differential gene expansions. For example, the present gene annotations initially described the human CXNK1 gene and annotated 22 human connexin genes. Phylogenetic tree calculations and calculations of pairwise nucleotide sequence identity patterns proposed a revised and updated phylogenetic classification of eutherian connexin genes. Therefore, the present study integrating gene annotations, phylogenetic analysis and protein molecular evolution analysis proposed a new nomenclature of eutherian connexin genes and proteins.
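The abstract mentions calculations of pairwise nucleotide sequence identity but does not say how they were carried out. The sketch below shows one common convention, assuming the sequences are already aligned to equal length and that columns containing a gap in either sequence are skipped. It is an illustration only, not the protocol's actual calculation, and the function name and toy sequences are hypothetical.

```python
def pairwise_identity(seq_a: str, seq_b: str) -> float:
    """Fraction of identical bases over aligned columns without gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are excluded
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return matches / compared


# Toy aligned sequences (hypothetical, not taken from the data sets).
print(pairwise_identity("ATGC", "ATGA"))  # → 0.75
```

Other conventions count gapped columns as mismatches or divide by the length of the shorter sequence, so identity values are only comparable when the same convention is used throughout.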


2019
Author(s):
Thomas Flouris
Xiyun Jiao
Bruce Rannala
Ziheng Yang

Abstract: Recent analyses suggest that cross-species gene flow or introgression is common in nature, especially during species divergences. Genomic sequence data can be used to infer introgression events and to estimate the timing and intensity of introgression, providing an important means to advance our understanding of the role of gene flow in speciation. Here we implement the multispecies-coalescent-with-introgression (MSci) model, an extension of the multispecies-coalescent (MSC) model to incorporate introgression, in our Bayesian Markov chain Monte Carlo (MCMC) program BPP. The MSci model accommodates deep coalescence (or incomplete lineage sorting) and introgression and provides a natural framework for inference using genomic sequence data. Computer simulation confirms the good statistical properties of the method, although hundreds or thousands of loci are typically needed to estimate introgression probabilities reliably. Re-analysis of datasets from the purple cone spruce confirms the hypothesis of homoploid hybrid speciation. Using genomic sequence data from six mosquito species in the Anopheles gambiae species complex, we estimated the introgression probability, which varies considerably across the genome, likely driven by differential selection against introgressed alleles.


PeerJ
2016
Vol 4
pp. e2806
Author(s):
YaDong Wang
Christopher Chandler

The bacterial genus Rickettsiella belongs to the order Legionellales in the Gammaproteobacteria, and consists of several described species and pathotypes, most of which are considered to be intracellular pathogens infecting arthropods. Two members of this genus, R. grylli and R. isopodorum, are known to infect terrestrial isopod crustaceans. In this study, we assembled a draft genomic sequence for R. isopodorum, and performed a comparative genomic analysis with R. grylli. We found evidence for several candidate genomic island regions in R. isopodorum, none of which appear in the previously available R. grylli genome sequence. Furthermore, one of these genomic island candidates in R. isopodorum contained a gene that encodes a cytotoxin partially homologous to those found in Photorhabdus luminescens and Xenorhabdus nematophilus (Enterobacteriaceae), suggesting that horizontal gene transfer may have played a role in the evolution of pathogenicity in Rickettsiella. These results lay the groundwork for future studies on the mechanisms underlying pathogenesis in R. isopodorum, and this system may provide a good model for studying the evolution of host-microbe interactions in nature.


2021
Vol 9 (6)
pp. 1332
Author(s):
Irene Artuso
Paolo Turrini
Mattia Pirolo
Gabriele Andrea Lugli
Marco Ventura
...  

Bacteria belonging to the genus Aminobacter are metabolically versatile organisms thriving in both natural and anthropized terrestrial environments. To date, the taxonomy of this genus is poorly defined due to the unavailability of the genomic sequence of A. anthyllidis LMG 26462T and the presence of unclassified Aminobacter strains. Here, we determined the genome sequence of A. anthyllidis LMG 26462T and performed phylogenomic, average nucleotide identity and digital DNA-DNA hybridization analyses of 17 members of genus Aminobacter. Our results indicate that 16S rRNA-based phylogeny does not provide sufficient species-level discrimination, since most of the unclassified Aminobacter strains belong to valid Aminobacter species or are putative new species. Since some members of the genus Aminobacter can utilize certain C1 compounds, such as methylamines and methyl halides, a comparative genomic analysis was performed to characterize the genetic basis of some degradative/assimilative pathways in the whole genus. Our findings suggest that all Aminobacter species are heterotrophic methylotrophs able to generate the methylene tetrahydrofolate intermediate through multiple oxidative pathways of C1 compounds and convey it in the serine cycle. Moreover, all Aminobacter species carry genes implicated in the degradation of phosphonates via the C-P lyase pathway, whereas only A. anthyllidis LMG 26462T contains a symbiosis island implicated in nodulation and nitrogen fixation.


2018
Author(s):
Huakun Zheng
Zhenhui Zhong
Mingyue Shi
Limei Zhang
Lianyu Lin
...  

Abstract: Background: Pyricularia is a multispecies complex that can infect and cause severe blast disease on diverse hosts, including rice, wheat and many other grasses. Although the genome size of this fungal complex is small [~40 Mbp for Pyricularia oryzae (syn. Magnaporthe oryzae), and ~45 Mbp for P. grisea], the genome plasticity allows the fungus to jump and adapt to new hosts. Therefore, deciphering the genomic basis of individual species could facilitate the evolutionary and genetic study of this fungus. However, except for the P. oryzae subgroup, many other species isolated from diverse hosts, such as the Pennisetum grasses, remain largely unexplored genetically. Results: Here, we report the genome sequence of a pyriform-shaped fungal strain, P. penniseti P1609, isolated from a Pennisetum grass (JUJUNCAO) using PacBio SMRT sequencing technology. We performed a phylogenomic analysis of 28 Magnaporthales species and 5 non-Magnaporthales species and assigned P1609 to a Pyricularia subclade that is distant from P. oryzae. Comparative genomic analysis revealed that the pathogenicity-related gene repertoires were fairly different between P1609 and the P. oryzae strain 70-15, including the cloned avirulence genes, other putative secreted proteins, as well as some other predicted Pathogen-Host Interaction (PHI) genes. Genomic sequence comparison also identified many genomic rearrangements. Conclusion: Taken together, our results suggest that the genomic sequence of P. penniseti P1609 could be a useful resource for the genetic study of the Pennisetum-infecting Pyricularia species.

