variant prioritization
Recently Published Documents



2021 ◽  
Vol 12 ◽  
Author(s):  
S. A. Durward-Akhurst ◽  
R. J. Schaefer ◽  
B. Grantham ◽  
W. K. Carey ◽  
J. R. Mickelson ◽  
...  

Genetic variation is a key contributor to health and disease. Understanding the link between an individual’s genotype and the corresponding phenotype is a major goal of medical genetics. Whole genome sequencing (WGS) within and across populations enables highly efficient variant discovery and elucidation of the molecular nature of virtually all genetic variation. Here, we report the largest catalog of genetic variation for the horse, a species of importance as a model for human athletic and performance-related traits, using WGS of 534 horses. We show the extent of agreement between two commonly used variant callers. In data from ten target breeds that represent major breed clusters in the domestic horse, we demonstrate the distribution of variants and their allele frequencies across breeds, and identify variants that are unique to a single breed. We investigate variants with no homozygotes that may be potential embryonic lethal variants, as well as variants present in all individuals that likely represent regions of the genome with errors or poor annotation, or where the reference genome carries a variant. Finally, we show regions of the genome that have higher or lower levels of genetic variation compared to the genome average. This catalog can be used for variant prioritization for important equine diseases and traits, and to provide key information about regions of the genome where the assembly and/or annotation need to be improved.
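As a rough illustration of why a variant with no homozygotes can be notable, the sketch below assumes Hardy-Weinberg equilibrium; the abstract does not say which test the authors used, so this is only one plausible screen. With alternate-allele frequency q in n genotyped individuals, the expected number of alternate homozygotes is n·q², and the chance of observing none is (1 − q²)^n. The function names and the example frequency of 0.10 are illustrative assumptions; only the cohort size of 534 comes from the abstract.

```python
def expected_homozygotes(n_individuals: int, alt_allele_freq: float) -> float:
    """Expected count of alternate homozygotes under Hardy-Weinberg equilibrium."""
    return n_individuals * alt_allele_freq ** 2


def prob_zero_homozygotes(n_individuals: int, alt_allele_freq: float) -> float:
    """Probability that no individual is an alternate homozygote under HWE."""
    return (1.0 - alt_allele_freq ** 2) ** n_individuals


n = 534   # cohort size reported in the abstract
q = 0.10  # hypothetical alternate-allele frequency, chosen for illustration

print(f"expected homozygotes: {expected_homozygotes(n, q):.2f}")
print(f"P(zero homozygotes):  {prob_zero_homozygotes(n, q):.4f}")
```

A small probability here only flags a variant for follow-up; genotyping error, population structure across breeds, and inbreeding all violate the equilibrium assumption.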


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Andreas Ruscheinski ◽  
Anna Lena Reimler ◽  
Roland Ewald ◽  
Adelinde M. Uhrmacher

Background: Clinical diagnostics of whole-exome and whole-genome sequencing data requires geneticists to consider thousands of genetic variants for each patient. Various variant prioritization methods have been developed over the last years to aid clinicians in identifying variants that are likely disease-causing. Each time a new method is developed, its effectiveness must be evaluated and compared to other approaches based on the most recently available evaluation data. Doing so in an unbiased, systematic, and replicable manner requires significant effort. Results: The open-source test bench “VPMBench” automates the evaluation of variant prioritization methods. VPMBench introduces a standardized interface for prioritization methods and provides a plugin system that makes it easy to evaluate new methods. It supports different input data formats and custom output data preparation. VPMBench exploits declaratively specified information about the methods, e.g., the variants supported by the methods. Plugins may also be provided in a technology-agnostic manner via containerization. Conclusions: VPMBench significantly simplifies the evaluation of both custom and published variant prioritization methods. As we expect variant prioritization methods to become ever more critical with the advent of whole-genome sequencing in clinical diagnostics, such tool support is crucial to facilitate methodological research.
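To make the idea of evaluating a prioritization method behind a uniform interface concrete, here is a minimal sketch. It is not VPMBench's API: the `evaluate_method` function, the `Variant` tuple, the toy scores, and the cutoff are all hypothetical and chosen only for illustration. A method is modelled as any function that maps a variant to a score, and the evaluation reports sensitivity and specificity at a fixed cutoff.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical variant representation: (chromosome, position, ref, alt).
Variant = Tuple[str, int, str, str]


def evaluate_method(
    score_variant: Callable[[Variant], float],
    labelled_variants: Iterable[Tuple[Variant, bool]],
    cutoff: float,
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when scores >= cutoff are called pathogenic."""
    tp = fp = tn = fn = 0
    for variant, is_pathogenic in labelled_variants:
        predicted_pathogenic = score_variant(variant) >= cutoff
        if predicted_pathogenic and is_pathogenic:
            tp += 1
        elif predicted_pathogenic:
            fp += 1
        elif is_pathogenic:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data, invented for this example only.
TOY_SCORES = {
    ("1", 1000, "A", "G"): 0.9,
    ("1", 2000, "C", "T"): 0.4,
    ("2", 3000, "G", "A"): 0.2,
    ("2", 4000, "T", "C"): 0.6,
}
TOY_LABELS = [
    (("1", 1000, "A", "G"), True),
    (("1", 2000, "C", "T"), True),
    (("2", 3000, "G", "A"), False),
    (("2", 4000, "T", "C"), False),
]


def toy_method(variant: Variant) -> float:
    """Stand-in for a real prioritization method: look up a fixed score."""
    return TOY_SCORES[variant]


sensitivity, specificity = evaluate_method(toy_method, TOY_LABELS, cutoff=0.5)
print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
```

Because every method is reduced to the same callable shape, a new method can be compared against existing ones by swapping in a different `score_variant` function while the labelled data and metrics stay fixed.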


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
So Young Kim ◽  
Seungmin Lee ◽  
Go Hun Seo ◽  
Bong Jik Kim ◽  
Doo Yi Oh ◽  
...  

Variant prioritization of exome sequencing (ES) data for molecular diagnosis of sensorineural hearing loss (SNHL) with extreme etiologic heterogeneity poses a significant challenge. This study used an automated variant prioritization system (“EVIDENCE”) to analyze SNHL patient data and assess its diagnostic accuracy. We performed ES of 263 probands manifesting mild to moderate or higher degrees of SNHL. Candidate variants were classified according to the 2015 American College of Medical Genetics guidelines, and we compared the accuracy, call rates, and efficiency of variant prioritizations performed manually by humans or using EVIDENCE. In our in silico panel, 21 synthetic cases were successfully analyzed by EVIDENCE. In our cohort, the ES diagnostic yield for SNHL by manual analysis was 50.19% (132/263) and 50.95% (134/263) by EVIDENCE. EVIDENCE processed ES data 24-fold faster than humans, and the concordant call rate between humans and EVIDENCE was 97.72% (257/263). Additionally, EVIDENCE outperformed human accuracy, especially at discovering causative variants of rare syndromic deafness, whereas flexible interpretations that required predefined specific genotype–phenotype correlations were possible only by manual prioritization. The automated variant prioritization system remarkably facilitated the molecular diagnosis of hearing loss with high accuracy and efficiency, fostering the popularization of molecular genetic diagnosis of SNHL.
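The percentages quoted in this abstract can be reproduced directly from the reported counts. The short check below uses only the numbers given above; the helper name `percent` is an illustrative choice.

```python
def percent(numerator: int, denominator: int) -> float:
    """Percentage rounded to two decimals, as reported in the abstract."""
    return round(100.0 * numerator / denominator, 2)


COHORT_SIZE = 263  # probands in the study

manual_yield = percent(132, COHORT_SIZE)    # diagnostic yield, manual analysis
evidence_yield = percent(134, COHORT_SIZE)  # diagnostic yield, EVIDENCE
concordance = percent(257, COHORT_SIZE)     # concordant calls, human vs EVIDENCE

print(manual_yield, evidence_yield, concordance)
```

The two-case difference in yield (134 versus 132) corresponds to under one percentage point, while the 6 discordant probands account for the gap between 97.72% and full concordance.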


2021 ◽  
Author(s):  
Sara Victoria Good ◽  
Ryan Gotesman ◽  
Ilya Kisselev ◽  
Andrew D. Paterson

GWAS have identified thousands of loci associated with human complex diseases and traits. How these loci are distributed through the genome has not been systematically evaluated. We hypothesised that the location of GWAS loci differs between ancestral linkage groups (ALGs) in ways related to the paralogy and function of genes. We used data from the NHGRI-EBI GWAS catalog to determine whether the density of GWAS loci relative to HapMap variants in each ALG differed, and whether ALGs were enriched for Experimental Factor Ontology (EFO) terms assigned to the GWAS traits. In gene-level analyses, we explored the characteristics of genes linked to GWAS loci and those mapping to the ALGs. We find that GWAS loci were enriched in 9 and deficient in 7 of the 17 ALGs, while there was no difference in the number of GWAS loci in regions of the human genome unassigned to an ALG. All but 2 ALGs were significantly enriched or deficient for one or more EFO terms. Lastly, we find that genes assigned to an ALG are under higher levels of selective constraint, have longer coding sequences and higher median expression in the tissue of highest expression than genes not mapping to an ALG. On the other hand, genes associated with GWAS loci have longer genomic length and exhibit higher levels of selective constraint relative to non-GWAS genes. Collectively, this suggests that understanding the location and ancestral origins of GWAS signals may be informative for the development of tools for variant prioritization and interpretation.


2021 ◽  
Author(s):  
Meng Yang ◽  
Haiping Huang ◽  
Lichao Huang ◽  
Nan Zhang ◽  
Jihong Wu ◽  
...  

Interpretation of the non-coding genome remains an unsolved challenge in human genetics because it is impractical to exhaustively annotate biochemically active elements in all conditions. Deep learning based computational approaches have recently emerged to help interpret non-coding regions. Here we present LOGO (Language of Genome), a self-attention based contextualized pre-trained language model with a substantially lightweight architecture, containing only 2 self-attention layers and 1 million parameters, that applies self-supervision techniques to learn bidirectional representations of the unlabeled human reference genome. LOGO is then fine-tuned for a sequence labelling task, and further extended to a variant prioritization task via a special input encoding scheme for alternative alleles followed by an added convolutional module. Experiments show that LOGO achieves a 15% absolute improvement for promoter identification and up to a 4.5% absolute improvement for enhancer-promoter interaction prediction. LOGO exhibits state-of-the-art multi-task predictive power on thousands of chromatin features with only 3% of the parameters of the fully supervised model DeepSEA and 1% of the parameters of a recent BERT-based language model for the human genome. For allelic-effect prediction, the locality introduced by one-dimensional convolution improves sensitivity and specificity for prioritizing non-coding variants associated with human diseases. In addition, we apply LOGO to interpret type 2 diabetes (T2D) GWAS signals and infer underlying regulatory mechanisms. We make a conceptual analogy between natural language and the human genome and demonstrate that LOGO is an accurate, fast, scalable, and robust framework to interpret non-coding regions for global sequence labeling as well as for variant prioritization at base resolution.


Author(s):  
Maliheh Najari Beidokhti ◽  
Alexander C. Bertalovitz ◽  
Weizhen Ji ◽  
Jorge McCormack ◽  
Lauren Jeffries ◽  
...  

2021 ◽  
Vol 132 ◽  
pp. S261
Author(s):  
Ana S.A. Cohen ◽  
Isabelle Thiffault ◽  
Emily Farrow ◽  
Warren Cheung ◽  
Jeffrey Johnston ◽  
...  

2021 ◽  
Vol 132 ◽  
pp. S82-S83
Author(s):  
Go Hun Seo ◽  
So Young Kim ◽  
Bong Jik Kim ◽  
Doo Yi Oh ◽  
Jin Hee Han ◽  
...  

2021 ◽  
Author(s):  
Neethukrishna Kausthubham ◽  
Anju Shukla ◽  
Neerja Gupta ◽  
Gandham SriLakshmi Bhavani ◽  
Samarth Kulshrestha ◽  
...  

Author(s):  
Matteo Chiara ◽  
Pietro Mandreoli ◽  
Marco Antonio Tangaro ◽  
Anna Maria D’Erchia ◽  
Sandro Sorrentino ◽  
...  

Motivation: Clinical applications of genome re-sequencing technologies typically generate large amounts of data that need to be carefully annotated and interpreted to identify genetic variants potentially associated with pathological conditions. In this context, accurate and reproducible methods for the functional annotation and prioritization of genetic variants are of fundamental importance. Results: In this paper, we present VINYL, a flexible and fully automated system for the functional annotation and prioritization of genetic variants. Extensive analyses of both real and simulated datasets suggest that VINYL can identify clinically relevant genetic variants in a more accurate manner compared to equivalent state-of-the-art methods, allowing a more rapid and effective prioritization of genetic variants in different experimental settings. As such we believe that VINYL can establish itself as a valuable tool to assist healthcare operators and researchers in clinical genomics investigations. Availability: VINYL is available at http://beaconlab.it/VINYL and https://github.com/matteo14c/VINYL. Supplementary information: Supplementary data are available at Bioinformatics online.

