Ehapp2: Estimate haplotype frequencies from pooled sequencing data with prior database information

2016 ◽  
Vol 14 (04) ◽  
pp. 1650017
Author(s):  
Chang-Chang Cao ◽  
Xiao Sun

To reduce the cost of large-scale re-sequencing, multiple individuals can be pooled together and sequenced jointly, a strategy known as pooled sequencing. Pooled sequencing provides a cost-effective alternative to sequencing individuals separately. To facilitate its application in haplotype-based disease association analysis, the critical step is to accurately estimate haplotype frequencies from the pooled samples. Here we present Ehapp2 for estimating haplotype frequencies from pooled sequencing data by utilizing a database that provides prior information on known haplotypes. We first translate the problem of estimating the frequency of each haplotype into finding a sparse solution to a system of linear equations, where the NNREG algorithm is employed to obtain the solution. Simulation experiments reveal that Ehapp2 is robust to sequencing errors and able to estimate haplotype frequencies with less than 3% average relative difference for pooled sequencing of a mixture of real Drosophila haplotypes at 50× total coverage, even when the sequencing error rate is as high as 0.05. Owing to the strategy of first accurately calculating the proportions of local haplotypes spanning multiple SNPs, Ehapp2 retains excellent estimation for recombinant haplotypes resulting from chromosomal crossover. Comparisons with existing methods reveal that Ehapp2 is state-of-the-art for many sequencing study designs and well suited to current massively parallel sequencing.
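The linear-system formulation above can be illustrated with a toy example: each column of the system matrix is the allele pattern of one known haplotype from the database, the right-hand side holds the pooled allele frequencies, and the haplotype frequencies are the non-negative solution. The sketch below uses a simple projected-gradient non-negative least-squares solver as a stand-in for NNREG (whose implementation the abstract does not detail); the three haplotypes and their frequencies are invented for illustration.

```python
def nnls_pg(A, b, iters=5000, lr=0.05):
    """Non-negative least squares via projected gradient descent:
    minimize ||A x - b|| subject to x >= 0."""
    m, n = len(A), len(A[0])
    x = [1.0 / n] * n
    for _ in range(iters):
        # residual r = A x - b
        r = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        # gradient g = A^T r, then project onto the non-negative orthant
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        x = [max(0.0, x[j] - lr * g[j]) for j in range(n)]
    return x

# Toy database: columns are the alternate-allele patterns of three known
# haplotypes H1=[0,0,0], H2=[1,0,1], H3=[1,1,1] over three SNPs.
A = [
    [0, 1, 1],   # SNP 1
    [0, 0, 1],   # SNP 2
    [0, 1, 1],   # SNP 3
    [1, 1, 1],   # extra row enforcing that frequencies sum to one
]
b = [0.5, 0.2, 0.5, 1.0]   # observed pooled allele frequencies (+ sum row)
freqs = nnls_pg(A, b)       # recovers approximately [0.5, 0.3, 0.2]
```

In the real method the system is far larger and the solution is sparse (most database haplotypes are absent from the pool), which is why a sparsity-seeking non-negative solver is used rather than ordinary least squares.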

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Yasemin Guenay-Greunke ◽  
David A. Bohan ◽  
Michael Traugott ◽  
Corinna Wallinger

Abstract High-throughput sequencing platforms are increasingly being used for targeted amplicon sequencing because they enable cost-effective sequencing of large sample sets. For meaningful interpretation of targeted amplicon sequencing data and comparison between studies, it is critical that bioinformatic analyses do not introduce artefacts and rely on detailed protocols to ensure that all methods are properly performed and documented. The analysis of large sample sets and the use of predefined indexes create challenges, such as adjusting the sequencing depth across samples and taking sequencing errors or index hopping into account. However, the potential biases these factors introduce into high-throughput amplicon sequencing data sets, and how they may be overcome, have rarely been addressed. Using the example of a nested metabarcoding analysis of 1920 carabid beetle regurgitates to assess plant feeding, we investigated: (i) the variation in sequencing depth of individually tagged samples and the effect of library preparation on the data output; (ii) the influence of sequencing errors within index regions and their consequences for demultiplexing; and (iii) the effect of index hopping. Our results demonstrate that, despite library quantification, large variation in read counts and sequencing depth occurred among samples, and that accounting for the sequencing error rate in bioinformatic software is essential for accurate adapter/primer trimming and demultiplexing. Moreover, setting an index hopping threshold to avoid incorrect assignment of samples is highly recommended.
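The demultiplexing problem described in point (ii) comes down to matching each read's index sequence against the expected sample indexes while tolerating sequencing errors. A minimal sketch of such error-aware index assignment is shown below; the index sequences, sample names, and one-mismatch tolerance are illustrative choices, not the paper's actual parameters.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))

def demultiplex(read_index, sample_indexes, max_mismatch=1):
    """Assign a read's observed index to a sample, tolerating up to
    max_mismatch sequencing errors; reads whose index is too distant
    from every expected index, or equally close to two samples, stay
    unassigned (None) rather than being forced into a sample."""
    hits = sorted((hamming(read_index, idx), name)
                  for name, idx in sample_indexes.items())
    best_dist, best_name = hits[0]
    if best_dist > max_mismatch:
        return None                          # too many index errors
    if len(hits) > 1 and hits[1][0] == best_dist:
        return None                          # ambiguous between samples
    return best_name

indexes = {"sample1": "ACGTAC", "sample2": "TTGCAA"}
```

Index hopping is a separate failure mode: the index reads are correct but belong to the wrong sample, so it cannot be caught by mismatch tolerance alone; this is why the abstract recommends an additional abundance threshold on unexpected index combinations.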


Nanomaterials ◽  
2021 ◽  
Vol 11 (7) ◽  
pp. 1646
Author(s):  
Jingya Xie ◽  
Wangcheng Ye ◽  
Linjie Zhou ◽  
Xuguang Guo ◽  
Xiaofei Zang ◽  
...  

In the last couple of decades, terahertz (THz) technologies, which lie in the frequency gap between the infrared and microwaves, have been greatly advanced and investigated owing to the opportunities they open up in a plethora of THz applications, such as imaging, security, and wireless communications. Photonics has led the way in the generation, modulation, and detection of THz waves, for example through the photomixing technique. In tandem with these investigations, researchers have been exploring ways to use silicon photonics technologies for THz applications to leverage the cost-effective, large-scale fabrication and integration opportunities they would enable. Although silicon photonics has enabled the implementation of a large number of optical components for practical use, for THz integrated systems we still face several challenges associated with high-quality hybrid silicon lasers, conversion efficiency, device integration, and fabrication. This paper provides an overview of recent progress in THz technologies based on silicon photonics or hybrid silicon photonics, including THz generation, detection, phase modulation, intensity modulation, and passive components. As silicon-based electronic and photonic circuits approach ever closer to THz frequencies, a single chip combining electronic, photonic, and THz functions seems inevitable, realizing the ultimate dream of a THz electronic-photonic integrated circuit.


2002 ◽  
Vol 06 (24) ◽  
pp. 958-965
Author(s):  
Jun Yu ◽  
Jian Wang ◽  
Huanming Yang

A coordinated international effort to sequence agricultural and livestock genomes is now timely. While the human genome and the genomes of many model organisms (related to human health and basic biological interests) have been sequenced or entered into sequencing pipelines, agronomically important crop and livestock genomes have not been given high enough priority. Although we face many challenges in policy-making, grant funding, regional task emphasis, research community consensus and technology innovation, many initiatives are being announced and formulated based on the cost-effective, large-scale sequencing procedure known as whole genome shotgun (WGS) sequencing, which produces draft sequences covering 95 to 99 percent of a genome. Genes identified from such draft sequences, coupled with other resources such as molecular markers, large-insert clones and cDNA sequences, provide ample information and tools to further our knowledge in agricultural and environmental biology in a genome era that is just entering its accelerated period. If the campaign succeeds, molecular biologists, geneticists and field biologists from all countries, rich or poor, would be brought to the same starting point and could expect another astronomical increase in basic genomic information, ready to be converted effectively into knowledge that will ultimately change our lives and environment for a greater and better future. We call upon national and international governmental agencies and organizations, as well as research foundations, to support this unprecedented movement.


2021 ◽  
Author(s):  
Y. Natalia Alfonso ◽  
Adnan A Hyder ◽  
Olakunle Alonge ◽  
Shumona Sharmin Salam ◽  
Kamran Baset ◽  
...  

Abstract Drowning is the leading cause of death among children 12-59 months old in rural Bangladesh. This study evaluated the cost-effectiveness of a large-scale crèche intervention in preventing child drowning. Estimates of the effectiveness of the crèches were based on prior studies, and the program cost was assessed using monthly program expenditures captured prospectively throughout the study period from two different implementing agencies. The study evaluated cost-effectiveness from both a program and a societal perspective. Results showed that from the program perspective the annual operating cost of a crèche was $416.35 (95% C.I.: $222 to $576), the annual cost per child was $16 (95% C.I.: $9 to $22) and the incremental cost-effectiveness ratio (ICER) per life saved with the crèches was $17,803 (95% C.I.: $9,051 to $27,625). From the societal perspective (including the value of parents' time) the ICER per life saved was -$176,62 (95% C.I.: -$347,091 to -$67,684), meaning the crèches generated net economic benefits per child enrolled. Based on the ICER per disability-adjusted life year averted from the societal perspective (excluding parents' time), $2,020, the crèche intervention was cost-effective even when the societal economic benefits were ignored. Based on this evidence, the crèche intervention has great potential for reducing child drowning at a reasonable cost.
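The ICER figures quoted above follow a standard definition: the difference in cost between the intervention and its comparator, divided by the difference in health effect. A minimal sketch with hypothetical numbers (not the study's actual inputs) is:

```python
def icer(cost_new, cost_base, effect_new, effect_base):
    """Incremental cost-effectiveness ratio: additional cost per additional
    unit of health effect (e.g. per life saved or per DALY averted).
    A negative ICER with a larger effect means the intervention both
    saves money and improves outcomes, i.e. it dominates the comparator,
    as with the societal-perspective result reported above."""
    return (cost_new - cost_base) / (effect_new - effect_base)

# Hypothetical inputs: a programme costing $16 per child per year that
# averts 0.0009 deaths per child-year, compared against doing nothing.
ratio = icer(cost_new=16.0, cost_base=0.0,
             effect_new=0.0009, effect_base=0.0)
# ratio is roughly $17,778 per life saved
```

The comparison against a willingness-to-pay threshold (or, as here, against zero for a dominant intervention) is what turns the ratio into a cost-effectiveness verdict.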


2019 ◽  
Vol 3 (7) ◽  
pp. 1600-1622 ◽  
Author(s):  
Ji-Lu Zheng ◽  
Ya-Hong Zhu ◽  
Ming-Qiang Zhu ◽  
Kang Kang ◽  
Run-Cang Sun

The commercial production of advanced fuels based on bio-oil gasification could be promising because the cost-effective transport of bio-oil could promote large-scale implementation of this biomass technology.


2019 ◽  
Vol 3 (4) ◽  
pp. 399-409 ◽  
Author(s):  
Brandon Jew ◽  
Jae Hoon Sul

Abstract Next-generation sequencing has allowed genetic studies to collect genome sequencing data from a large number of individuals. However, raw sequencing data are not usually interpretable due to fragmentation of the genome and technical biases; therefore, analysis of these data requires many computational approaches. First, for each sequenced individual, sequencing data are aligned and further processed to account for technical biases. Then, variant calling is performed to obtain information on the positions of genetic variants and their corresponding genotypes. Quality control (QC) is applied to identify individuals and genetic variants with sequencing errors. These procedures are necessary to generate accurate variant calls from sequencing data, and many computational approaches have been developed for these tasks. This review will focus on current widely used approaches for variant calling and QC.
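The QC step described above typically filters called variants on per-site metrics such as call quality, read depth, and genotype missingness. The sketch below is a generic illustration of that kind of filter; the dictionary field names are VCF-inspired but hypothetical, and the thresholds are common rule-of-thumb values rather than ones taken from this review.

```python
def pass_qc(variant, min_qual=30.0, min_depth=10, max_missing=0.1):
    """Basic per-site quality control after variant calling: drop sites
    with low call quality, low read depth, or too many missing genotypes."""
    gts = variant["genotypes"]                 # e.g. ["0/1", "./.", "1/1"]
    missing_rate = sum(g == "./." for g in gts) / len(gts)
    return (variant["qual"] >= min_qual
            and variant["depth"] >= min_depth
            and missing_rate <= max_missing)

# A well-supported site and one with half of its genotypes missing.
good = {"qual": 50.0, "depth": 35, "genotypes": ["0/0", "0/1", "1/1", "0/1"]}
bad  = {"qual": 50.0, "depth": 35, "genotypes": ["./.", "./.", "0/1", "1/1"]}
```

Production pipelines apply many more criteria (strand bias, Hardy-Weinberg deviation, per-sample filters), but the shape of the computation is the same: per-site summary statistics compared against thresholds.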


2020 ◽  
Vol 79 (2) ◽  
pp. 105-113
Author(s):  
Abdul Bari Muneera Parveen ◽  
Divya Lakshmanan ◽  
Modhumita Ghosh Dasgupta

The advent of next-generation sequencing has facilitated large-scale discovery and mapping of genomic variants for high-throughput genotyping. Several research groups working on tree species are presently employing next-generation sequencing (NGS) platforms for marker discovery, since it is a cost-effective and time-saving strategy. However, most trees lack a chromosome-level genome map, and validation of variants for downstream application becomes obligatory. The cost associated with identifying potential variants from the enormous amount of sequence data is a major limitation. In the present study, high resolution melting (HRM) analysis was optimized for rapid validation of single nucleotide polymorphisms (SNPs), insertions or deletions (InDels) and simple sequence repeats (SSRs) predicted from exome sequencing of parents and hybrids of Eucalyptus tereticornis Sm. × Eucalyptus grandis Hill ex Maiden generated from controlled hybridization. The cost per data point was less than 0.5 USD, providing great flexibility in terms of cost and sensitivity when compared to other validation methods. The sensitivity of this technology in variant detection can be extended to other applications, including Bar-HRM for species authentication and TILLING for detection of mutants.


1997 ◽  
Vol 36 (8-9) ◽  
pp. 307-311 ◽  
Author(s):  
R. Y. G. Andoh ◽  
C. Declerck

Rapid urbanisation, with its consequent increase in impermeable surface areas and changes in land use, has generally resulted in problems of flooding and heavy pollution of urban streams and other receiving waters. This has often been coupled with groundwater depletion and a threat to water resources. The first part of this paper presents an alternative drainage philosophy and strategy which mimics nature's way by slowing down (attenuating) the movement of urban runoff. This approach results in cost-effective, affordable and sustainable drainage schemes. The alternative strategy can be described as one of prevention rather than cure, effecting controls closer to the source, in contrast to the traditional approach of transferring problems downstream, which leads to their accumulation and the need for large-scale, centralised control. The second part describes a research project which has been launched in order to quantify the cost and operational benefits of source control and distributed storage. Details of the methodology of the modelling and simulation processes being followed to achieve this target are presented.


2018 ◽  
Vol 2018 ◽  
pp. 1-14 ◽  
Author(s):  
Alexandros Tsipianitis ◽  
Yiannis Tsompanakis

Liquid-filled tanks are effective storage infrastructure for water, oil, and liquefied natural gas (LNG). Many such large-scale tanks are located in regions of high seismicity. Therefore, base isolation technology frequently has to be adopted to reduce the dynamic distress of storage tanks, protecting the structure from typical modes of failure such as elephant-foot buckling, diamond-shaped buckling, and roof damage caused by liquid sloshing. The cost-effective seismic design of base-isolated liquid storage tanks can be achieved by adopting performance-based design (PBD) principles. In this work, the focus is on sliding-based systems, namely single friction pendulum bearings (SFPBs), triple friction pendulum bearings (TFPBs), and mainly the recently developed quintuple friction pendulum bearings (QFPBs). More specifically, the study focuses on the fragility analysis of tanks isolated by sliding bearings, with emphasis on isolator displacements due to near-fault earthquakes. In addition, a surrogate model has been developed for simulating the dynamic response of the superstructure (tank and liquid content) to achieve an optimal balance between computational efficiency and accuracy.

