Naught all zeros in sequence count data are the same

2018 ◽  
Author(s):  
Justin D. Silverman ◽  
Kimberly Roche ◽  
Sayan Mukherjee ◽  
Lawrence A. David

Abstract Genomic studies feature multivariate count data from high-throughput DNA sequencing experiments, which often contain many zero values. These zeros can cause artifacts in statistical analyses, and multiple modeling approaches have been developed in response. Here, we apply common zero-handling models to gene-expression and microbiome datasets and show that the models disagree on average by 46% in identifying the most differentially expressed sequences. Next, to rationally examine how different zero-handling models behave, we developed a conceptual framework outlining four types of processes that may give rise to zero values in sequence count data. Last, we performed simulations to test how zero-handling models behave in the presence of these different zero-generating processes. Our simulations showed that simple count models are sufficient across multiple processes, even when the true underlying process is unknown. On the other hand, a common zero-handling technique known as “zero-inflation” was only suitable under a zero-generating process associated with an unlikely set of biological and experimental conditions. In concert, our work here suggests several specific guidelines for developing and choosing state-of-the-art models for analyzing sparse sequence count data.
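As a minimal sketch of what “zero-inflation” means in general, the Python snippet below compares a plain Poisson count model with a zero-inflated Poisson that mixes in extra structural zeros. The Poisson base distribution, the function names, and the parameter values are assumptions chosen for illustration; the abstract does not state which specific models the authors compared.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of observing count k under a Poisson(lam) model."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k: int, lam: float, pi: float) -> float:
    """Zero-inflated Poisson (illustrative).

    With probability pi the observation is a structural zero;
    otherwise it is drawn from Poisson(lam).
    """
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


# Hypothetical parameters: mean count 2.0, 30% structural zeros.
lam, pi = 2.0, 0.3
p_zero_poisson = poisson_pmf(0, lam)
p_zero_zip = zip_pmf(0, lam, pi)
```

With any `pi` greater than zero, `p_zero_zip` exceeds `p_zero_poisson`, which is the extra mass at zero that a zero-inflated model attributes to a separate zero-generating process.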

Author(s):  
Hideo Nakamura ◽  
Tadashi Watanabe ◽  
Takeshi Takeda ◽  
Hideaki Asaka ◽  
Masaya Kondo ◽  
...  

The Japan Atomic Energy Agency (JAEA) started the OECD/NEA ROSA Project in 2005 to resolve issues in the thermal-hydraulic analyses relevant to LWR safety through the experiments of ROSA/LSTF in JAEA. More than 17 organizations from 14 NEA member countries have joined the Project. The ROSA Project intends to focus on the validation of simulation models and methods for complex phenomena that may occur during DBEs and beyond-DBE transients. Twelve experiments of six types are to be conducted. By utilizing the obtained data, the predictability of codes is validated. Nine experiments have been performed in the ROSA Project to date. The results of two of these experiments, PV top and bottom small-break (SB) LOCA simulations, are studied here through comparisons with the results from pre-test and post-test analyses by using the RELAP5/MOD3.2 code as a typical and well-utilized/improved best estimate (BE) code. The experimental conditions were defined based on the pre-test (blind) analysis. The comparison with the experiment results may clearly indicate the state of the art of the code in dealing with relevant reactor accidents. The code predictive capability was verified further through the post-test analysis. The obtained issues in the utilization of the RELAP5 code are summarized as well as the outline of the ROSA Project.


Complexity ◽  
2019 ◽  
Vol 2019 ◽  
pp. 1-12 ◽  
Author(s):  
Adrian Carballal ◽  
Carlos Fernandez-Lozano ◽  
Nereida Rodriguez-Fernandez ◽  
Luz Castro ◽  
Antonino Santos

An important topic in evolutionary art is the development of systems that can mimic the aesthetic decisions made by human beings, e.g., fitness evaluations made by humans using interactive evolution in generative art. This paper focuses on the analysis of several datasets used for aesthetic prediction based on ratings from photography websites and psychological experiments. Since these datasets present problems, we proposed a new dataset that is a subset of DPChallenge.com. Subsequently, three different evaluation methods were considered, one derived from the ratings available at DPChallenge.com and two obtained under experimental conditions related to the aesthetics and quality of images. We observed different criteria in the DPChallenge.com ratings, which had more to do with the photographic quality than with the aesthetic value. Finally, we explored learning systems other than state-of-the-art ones, in order to predict these three values. The obtained results were similar to those using state-of-the-art procedures.


2014 ◽  
Vol 12 (5S) ◽  
pp. 801-803
Author(s):  
John C. Byrd

In the treatment of chronic lymphocytic leukemia (CLL), select genomic studies can assist in risk stratification of newly diagnosed patients. Chemoimmunotherapy targeting CD20 offers a survival advantage in symptomatic patients both with and without these high-risk genetic features, though patients with del(17p13.1) have poor outcomes and require specific intervention. Obinutuzumab plus chlorambucil is a treatment standard for untreated elderly patients and is superior to rituximab plus chlorambucil. In the setting of relapsed CLL, the new kinase inhibitors have the potential to completely change the treatment paradigm of CLL.


1998 ◽  
Vol 9 ◽  
pp. 317-365 ◽  
Author(s):  
G. Di Caro ◽  
M. Dorigo

This paper introduces AntNet, a novel approach to the adaptive learning of routing tables in communications networks. AntNet is a distributed, mobile-agent-based Monte Carlo system that was inspired by recent work on the ant colony metaphor for solving optimization problems. AntNet's agents concurrently explore the network and exchange collected information. The communication among the agents is indirect and asynchronous, mediated by the network itself. This form of communication is typical of social insects and is called stigmergy. We compare our algorithm with six state-of-the-art routing algorithms coming from the telecommunications and machine learning fields. The algorithms' performance is evaluated over a set of realistic testbeds. We run many experiments over real and artificial IP datagram networks with an increasing number of nodes and under several paradigmatic spatial and temporal traffic distributions. Results are very encouraging. AntNet showed superior performance under all the experimental conditions with respect to its competitors. We analyze the main characteristics of the algorithm and try to explain the reasons for its superiority.


2020 ◽  
Vol 32 (3) ◽  
pp. 233-245
Author(s):  
Mark Eramian ◽  
Christopher Power ◽  
Stephen Rau ◽  
Pulkit Khandelwal

Abstract Semi-automated segmentation algorithms hold promise for improving extraction and identification of objects in images such as tumors in medical images of human tissue, counting plants or flowers for crop yield prediction or other tasks where object numbers and appearance vary from image to image. By blending markup from human annotators with algorithmic classifiers, the accuracy and reproducibility of image segmentation can be raised to very high levels. At least, that is the promise of this approach, but the reality is less than clear. In this paper, we review the state of the art in semi-automated image segmentation performance assessment and demonstrate it to be lacking the level of experimental rigour needed to ensure that claims about algorithm accuracy and reproducibility can be considered valid. We follow this review with two experiments that vary the type of markup that annotators make on images, either points or strokes, in tightly controlled experimental conditions in order to investigate the effect that this one particular source of variation has on the accuracy of these types of systems. In both experiments, we found that accuracy substantially increases when participants use a stroke-based interaction. In light of these results, the validity of claims about algorithm performance is brought into sharp focus, and we reflect on the need for far more control of variables for benchmarking the impact of annotators and their context on these types of systems.


2017 ◽  
Author(s):  
M. Senthil Kumar ◽  
Eric V. Slud ◽  
Kwame Okrah ◽  
Stephanie C. Hicks ◽  
Sridhar Hannenhalli ◽  
...  

Abstract Count data derived from high-throughput DNA sequencing is frequently used in quantitative molecular assays. Due to properties inherent to the sequencing process, unnormalized count data is compositional, measuring relative and not absolute abundances of the assayed features. This compositional bias confounds inference of absolute abundances. We demonstrate that existing techniques for estimating compositional bias fail with sparse metagenomic 16S count data and propose an empirical Bayes normalization approach to overcome this problem. In addition, we clarify the assumptions underlying frequently used scaling normalization methods in light of compositional bias, including scaling methods that were not designed directly to address it.
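The compositional effect described above can be illustrated with a small sketch, assuming hypothetical absolute abundances for three features in two samples. It shows only the bias that arises when counts are reduced to proportions, not the empirical Bayes normalization the authors propose; all names and numbers are illustrative.

```python
def relative_abundance(counts):
    """Convert counts to proportions, as sequencing effectively does."""
    total = sum(counts)
    return [c / total for c in counts]


# Hypothetical absolute abundances: only feature 0 changes between samples.
absolute_a = [100, 100, 100]
absolute_b = [1000, 100, 100]

rel_a = relative_abundance(absolute_a)
rel_b = relative_abundance(absolute_b)

# Features 1 and 2 are unchanged in absolute terms, yet their
# relative abundances drop because feature 0 takes a larger share.
apparent_fold_change = [b / a for a, b in zip(rel_a, rel_b)]
```

Here the unchanged features appear to decrease in the relative data, which is why inference about absolute abundances from unnormalized proportions can mislead.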


2020 ◽  
Author(s):  
Shuyi Zhang ◽  
Jacob R. Leistico ◽  
Christopher Cook ◽  
Yale Liu ◽  
Raymond J. Cho ◽  
...  

Recent advances in next generation sequencing-based single-cell technologies have allowed high-throughput quantitative detection of cell-surface proteins along with the transcriptome in individual cells, extending our understanding of the heterogeneity of cell populations in diverse tissues that are in different diseased states or under different experimental conditions. Count data of surface proteins from the cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) technology pose new computational challenges, and there is currently a dearth of rigorous mathematical tools for analyzing the data. This work utilizes concepts and ideas from Riemannian geometry to remove batch effects between samples and develops a statistical framework for distinguishing positive signals from background noise. The strengths of these approaches are demonstrated on two independent CITE-seq data sets in mouse and human. Python source code implementing the algorithms is available at https://github.com/jssong-lab/SAGACITE.


Cancers ◽  
2021 ◽  
Vol 13 (12) ◽  
pp. 3100
Author(s):  
Giampaolo Morciano ◽  
Bianca Vezzani ◽  
Sonia Missiroli ◽  
Caterina Boncompagni ◽  
Paolo Pinton ◽  
...  

Yes-associated protein (YAP) has emerged as a key component in cancer signaling and is considered a potent oncogene. As such, nuclear YAP participates in complex and only partially understood molecular cascades that are responsible for the oncogenic response by regulating multiple processes, including cell transformation, tumor growth, migration, and metastasis, and by acting as an important mediator of immune and cancer cell interactions. YAP is finely regulated at multiple levels, and its localization in cells in terms of cytoplasm–nucleus shuttling (and vice versa) sheds light on interesting novel anticancer treatment opportunities and putative unconventional functions of the protein when retained in the cytosol. This review aims to summarize and present the state-of-the-art knowledge about the role of YAP in cancer signaling, first focusing on how YAP differs from WW domain-containing transcription regulator 1 (WWTR1, also known as TAZ) and which upstream factors regulate it; then, this review focuses on the role of YAP in different cancer stages and in the crosstalk between immune and cancer cells as well as growing translational strategies derived from its inhibitory and synergistic effects with existing chemo-, immuno- and radiotherapies.


Author(s):  
Palvi Aggarwal ◽  
Omkar Thakoor ◽  
Aditya Mate ◽  
Milind Tambe ◽  
Edward A. Cranford ◽  
...  

During the network reconnaissance process, attackers scan the network to gather information before launching an attack. This is a good chance for defenders to use deception and disrupt the attacker’s learning process. In this paper, we present an exploratory experiment to test the effectiveness of a masking strategy (compared to a random masking strategy) to reduce the utility of attackers. A total of 30 human participants (in the role of attackers) are randomly assigned to one of the two experimental conditions: Optimal or Random (15 in each condition). Attackers appeared to be more successful in launching attacks in the optimal condition compared to the random condition but the total score of attackers was not different from the random masking strategy. Most importantly, we found a generalized tendency to act according to the certainty bias (or risk aversion). These observations will help to improve the current state-of-the-art masking algorithms of cyberdefense.

