Naught all zeros in sequence count data are the same

10.1101/477794 ◽

2018 ◽

Cited By ~ 22

Author(s):

Justin D. Silverman ◽

Kimberly Roche ◽

Sayan Mukherjee ◽

Lawrence A. David

Keyword(s):

Count Data ◽

State Of The Art ◽

Common Zero ◽

Experimental Conditions ◽

Multiple Processes ◽

Genomic Studies ◽

High Throughput Dna Sequencing ◽

Zero Values ◽

Handling Technique ◽

Simple Count

Genomic studies feature multivariate count data from high-throughput DNA sequencing experiments, which often contain many zero values. These zeros can cause artifacts in statistical analyses, and multiple modeling approaches have been developed in response. Here, we apply common zero-handling models to gene-expression and microbiome datasets and show that models disagree on average by 46% in terms of identifying the most differentially expressed sequences. Next, to rationally examine how different zero-handling models behave, we developed a conceptual framework outlining four types of processes that may give rise to zero values in sequence count data. Last, we performed simulations to test how zero-handling models behave in the presence of these different zero-generating processes. Our simulations showed that simple count models are sufficient across multiple processes, even when the true underlying process is unknown. On the other hand, a common zero-handling technique known as “zero-inflation” was only suitable under a zero-generating process associated with an unlikely set of biological and experimental conditions. In concert, our work here suggests several specific guidelines for developing and choosing state-of-the-art models for analyzing sparse sequence count data.
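
The contrast the abstract draws between simple count models and zero-inflation can be made concrete with a small simulation. The sketch below is illustrative only and is not the authors' code or models: it assumes a plain Poisson model as the "simple count model" and a zero-inflated Poisson as the "zero-inflation" technique, and all function names and parameter values (`rate`, `pi_zero`) are hypothetical. It shows that a simple count model already produces zeros through sampling alone, while zero-inflation adds an extra point mass at zero on top of that.

```python
import math
import random


def poisson_sample(rate, rng):
    """Draw one Poisson(rate) variate using Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def zero_inflated_poisson_sample(rate, pi_zero, rng):
    """With probability pi_zero emit a structural zero, otherwise a Poisson draw."""
    if rng.random() < pi_zero:
        return 0
    return poisson_sample(rate, rng)


def expected_zero_fraction(rate, pi_zero=0.0):
    """Probability of observing a zero: pi_zero + (1 - pi_zero) * exp(-rate)."""
    return pi_zero + (1.0 - pi_zero) * math.exp(-rate)


def simulated_zero_fraction(rate, pi_zero, n, seed):
    """Fraction of zeros among n simulated counts (hypothetical helper)."""
    rng = random.Random(seed)
    zeros = sum(
        1 for _ in range(n)
        if zero_inflated_poisson_sample(rate, pi_zero, rng) == 0
    )
    return zeros / n


if __name__ == "__main__":
    # Assumed illustrative values: mean count 2, 30% structural zeros.
    for pi_zero in (0.0, 0.3):
        simulated = simulated_zero_fraction(2.0, pi_zero, 20000, seed=1)
        expected = expected_zero_fraction(2.0, pi_zero)
        print(f"pi_zero={pi_zero}: simulated={simulated:.3f}, expected={expected:.3f}")
```

With `pi_zero=0.0` the model reduces to a plain Poisson, so any zeros it produces come from sampling a low-abundance feature rather than from a separate zero-generating mechanism.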

RELAP5/MOD3 Code Verification Through PWR Pressure Vessel Small Break LOCA Tests in OECD/NEA ROSA Project

Volume 3: Thermal Hydraulics; Instrumentation and Controls ◽

10.1115/icone16-48615 ◽

2008 ◽

Cited By ~ 2

Author(s):

Hideo Nakamura ◽

Tadashi Watanabe ◽

Takeshi Takeda ◽

Hideaki Asaka ◽

Masaya Kondo ◽

...

Keyword(s):

State Of The Art ◽

Simulation Models ◽

Predictive Capability ◽

Experimental Conditions ◽

Energy Agency ◽

Test Analysis ◽

Post Test ◽

Complex Phenomena ◽

Reactor Accidents ◽

Relap5 Code

The Japan Atomic Energy Agency (JAEA) started the OECD/NEA ROSA Project in 2005 to resolve issues in thermal-hydraulic analyses relevant to LWR safety through ROSA/LSTF experiments at JAEA. More than 17 organizations from 14 NEA member countries have joined the Project. The ROSA Project focuses on the validation of simulation models and methods for complex phenomena that may occur during DBEs and beyond-DBE transients. Twelve experiments of six types are to be conducted, and the data obtained are used to validate the predictive capability of codes. Nine experiments have been performed in the ROSA Project to date. The results of two of these experiments, PV top and bottom small-break (SB) LOCA simulations, are studied here through comparisons with the results of pre-test and post-test analyses using the RELAP5/MOD3.2 code, a typical, widely used and improved best-estimate (BE) code. The experimental conditions were defined based on the pre-test (blind) analysis. The comparison with the experimental results may clearly indicate the state of the art of the code in dealing with relevant reactor accidents. The code's predictive capability was verified further through the post-test analysis. The issues identified in the use of the RELAP5 code are summarized, along with an outline of the ROSA Project.

Avoiding the Inherent Limitations in Datasets Used for Measuring Aesthetics When Using a Machine Learning Approach

Complexity ◽

10.1155/2019/4659809 ◽

2019 ◽

Vol 2019 ◽

pp. 1-12 ◽

Cited By ~ 3

Author(s):

Adrian Carballal ◽

Carlos Fernandez-Lozano ◽

Nereida Rodriguez-Fernandez ◽

Luz Castro ◽

Antonino Santos

Keyword(s):

State Of The Art ◽

Learning Approach ◽

Aesthetic Value ◽

Experimental Conditions ◽

Generative Art ◽

Machine Learning Approach ◽

Evolutionary Art ◽

The Aesthetic ◽

Psychological Experiments

An important topic in evolutionary art is the development of systems that can mimic the aesthetic decisions made by human beings, e.g., fitness evaluations made by humans using interactive evolution in generative art. This paper focuses on the analysis of several datasets used for aesthetic prediction based on ratings from photography websites and psychological experiments. Since these datasets present problems, we proposed a new dataset that is a subset of DPChallenge.com. Subsequently, three different evaluation methods were considered, one derived from the ratings available at DPChallenge.com and two obtained under experimental conditions related to the aesthetics and quality of images. We observed different criteria in the DPChallenge.com ratings, which had more to do with photographic quality than with aesthetic value. Finally, we explored learning systems other than state-of-the-art ones in order to predict these three values. The results obtained were similar to those obtained using state-of-the-art procedures.

Chronic Lymphocytic Leukemia: State of the Art and Beyond

Journal of the National Comprehensive Cancer Network ◽

10.6004/jnccn.2014.0194 ◽

2014 ◽

Vol 12 (5S) ◽

pp. 801-803

Author(s):

John C. Byrd

Keyword(s):

Chronic Lymphocytic Leukemia ◽

Kinase Inhibitors ◽

State Of The Art ◽

Lymphocytic Leukemia ◽

Newly Diagnosed ◽

Treatment Standard ◽

Genetic Features ◽

Treatment Paradigm ◽

Genomic Studies ◽

Poor Outcomes

In the treatment of chronic lymphocytic leukemia (CLL), select genomic studies can assist in risk stratification of newly diagnosed patients. Chemoimmunotherapy targeting CD20 offers a survival advantage in symptomatic patients both with and without these high-risk genetic features, though patients with del(17p13.1) have poor outcomes and require specific intervention. Obinutuzumab plus chlorambucil is a treatment standard for untreated elderly patients and is superior to rituximab plus chlorambucil. In the setting of relapsed CLL, the new kinase inhibitors have the potential to completely change the treatment paradigm of CLL.

AntNet: Distributed Stigmergetic Control for Communications Networks

Journal of Artificial Intelligence Research ◽

10.1613/jair.530 ◽

1998 ◽

Vol 9 ◽

pp. 317-365 ◽

Cited By ~ 870

Author(s):

G. Di Caro ◽

M. Dorigo

Keyword(s):

Adaptive Learning ◽

Mobile Agents ◽

Optimization Problems ◽

State Of The Art ◽

Routing Algorithms ◽

Superior Performance ◽

Experimental Conditions ◽

Communications Networks ◽

Novel Approach ◽

Routing Tables

This paper introduces AntNet, a novel approach to the adaptive learning of routing tables in communications networks. AntNet is a distributed, mobile agents based Monte Carlo system that was inspired by recent work on the ant colony metaphor for solving optimization problems. AntNet's agents concurrently explore the network and exchange collected information. The communication among the agents is indirect and asynchronous, mediated by the network itself. This form of communication is typical of social insects and is called stigmergy. We compare our algorithm with six state-of-the-art routing algorithms coming from the telecommunications and machine learning fields. The algorithms' performance is evaluated over a set of realistic testbeds. We run many experiments over real and artificial IP datagram networks with increasing number of nodes and under several paradigmatic spatial and temporal traffic distributions. Results are very encouraging. AntNet showed superior performance under all the experimental conditions with respect to its competitors. We analyze the main characteristics of the algorithm and try to explain the reasons for its superiority.
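
The idea of adaptively learning a routing table can be illustrated with a probabilistic table that is nudged toward a next hop whenever an agent reports a good trip through it. The sketch below is a simplified illustration under stated assumptions, not the paper's full algorithm: the abstract does not give the update rule, so the reinforcement formula, the function name `reinforce`, and the example values are all hypothetical choices for demonstration.

```python
def reinforce(probabilities, chosen_hop, reinforcement):
    """Return a new next-hop distribution shifted toward chosen_hop.

    probabilities: dict mapping next-hop name -> probability (sums to 1).
    reinforcement: value in (0, 1); larger means a stronger shift.
    The chosen hop gains reinforcement * (1 - p); every other hop loses
    reinforcement * p, so the probabilities still sum to 1.
    """
    if chosen_hop not in probabilities:
        raise KeyError(f"unknown next hop: {chosen_hop}")
    if not 0.0 < reinforcement < 1.0:
        raise ValueError("reinforcement must lie strictly between 0 and 1")
    updated = {}
    for hop, p in probabilities.items():
        if hop == chosen_hop:
            updated[hop] = p + reinforcement * (1.0 - p)
        else:
            updated[hop] = p - reinforcement * p
    return updated


if __name__ == "__main__":
    # Assumed example: three neighbours for a single destination.
    table = {"A": 0.5, "B": 0.25, "C": 0.25}
    table = reinforce(table, "A", 0.2)
    print(table)
```

Because agents only ever modify tables stored at the nodes they visit, the information they exchange stays indirect and mediated by the network, which is the stigmergy the abstract describes.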

Benchmarking Human Performance in Semi-Automated Image Segmentation

Interacting with Computers ◽

10.1093/iwcomp/iwaa017 ◽

2020 ◽

Vol 32 (3) ◽

pp. 233-245

Author(s):

Mark Eramian ◽

Christopher Power ◽

Stephen Rau ◽

Pulkit Khandelwal

Keyword(s):

Image Segmentation ◽

Human Performance ◽

State Of The Art ◽

Automated Segmentation ◽

Experimental Conditions ◽

Algorithm Performance ◽

Sharp Focus ◽

Segmentation Algorithms ◽

The Impact ◽

Very High

Semi-automated segmentation algorithms hold promise for improving the extraction and identification of objects in images, such as tumors in medical images of human tissue, counting plants or flowers for crop yield prediction, or other tasks where object numbers and appearance vary from image to image. By blending markup from human annotators with algorithmic classifiers, the accuracy and reproducibility of image segmentation can be raised to very high levels. At least, that is the promise of this approach, but the reality is less than clear. In this paper, we review the state of the art in semi-automated image segmentation performance assessment and demonstrate that it lacks the level of experimental rigour needed to ensure that claims about algorithm accuracy and reproducibility can be considered valid. We follow this review with two experiments that vary the type of markup that annotators make on images, either points or strokes, in tightly controlled experimental conditions in order to investigate the effect that this one particular source of variation has on the accuracy of these types of systems. In both experiments, we found that accuracy substantially increases when participants use a stroke-based interaction. In light of these results, the validity of claims about algorithm performance is brought into sharp focus, and we reflect on the need for far more control over variables when benchmarking the impact of annotators and their context on these types of systems.

Modeling and comparison of count data containing zero values: a case study of Setipinna taty in the south inshore of Zhejiang, China

Environmental Science and Pollution Research ◽

10.1007/s11356-021-13440-5 ◽

2021 ◽

Author(s):

Xiaoxue Liu ◽

Chunxia Gao ◽

Jing Zhao ◽

Siquan Tian ◽

Shen Ye ◽

...

Keyword(s):

Count Data ◽

The South ◽

Zero Values

Analysis and correction of compositional bias in sparse sequencing count data

10.1101/142851 ◽

2017 ◽

Cited By ~ 2

Author(s):

M. Senthil Kumar ◽

Eric V. Slud ◽

Kwame Okrah ◽

Stephanie C. Hicks ◽

Sridhar Hannenhalli ◽

...

Keyword(s):

Dna Sequencing ◽

High Throughput ◽

Count Data ◽

Empirical Bayes ◽

Compositional Bias ◽

Molecular Assays ◽

Normalization Methods ◽

Scaling Methods ◽

High Throughput Dna Sequencing ◽

Sequencing Process

Count data derived from high-throughput DNA sequencing is frequently used in quantitative molecular assays. Due to properties inherent to the sequencing process, unnormalized count data is compositional, measuring relative and not absolute abundances of the assayed features. This compositional bias confounds inference of absolute abundances. We demonstrate that existing techniques for estimating compositional bias fail with sparse metagenomic 16S count data and propose an empirical Bayes normalization approach to overcome this problem. In addition, we clarify the assumptions underlying frequently used scaling normalization methods in light of compositional bias, including scaling methods that were not designed directly to address it.
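
A toy calculation shows why relative abundances can mislead about absolute ones. The sketch below is not the authors' method (it does not implement their empirical Bayes normalization); it only illustrates the compositional effect itself, and the feature names and abundance values are made up for the example. Only feature `c` changes in absolute terms, yet the relative abundances of `a` and `b` fall as well.

```python
def relative_abundances(absolute):
    """Convert absolute abundances (dict of feature -> amount) to proportions."""
    total = sum(absolute.values())
    if total <= 0:
        raise ValueError("total abundance must be positive")
    return {feature: amount / total for feature, amount in absolute.items()}


if __name__ == "__main__":
    # Hypothetical absolute abundances in two conditions; only "c" changes.
    before = {"a": 100, "b": 100, "c": 200}
    after = {"a": 100, "b": 100, "c": 600}

    rel_before = relative_abundances(before)
    rel_after = relative_abundances(after)

    for feature in before:
        print(
            f"{feature}: absolute {before[feature]} -> {after[feature]}, "
            f"relative {rel_before[feature]:.3f} -> {rel_after[feature]:.3f}"
        )
```

A sequencer that reads a fixed number of molecules per sample observes something closer to the relative columns, so features `a` and `b` would appear to decrease even though their absolute abundances are unchanged.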

Riemannian geometry and statistical modeling correct for batch effects and control false discoveries in single-cell surface protein count data from CITE-seq

10.1101/2020.04.28.067306 ◽

2020 ◽

Author(s):

Shuyi Zhang ◽

Jacob R. Leistico ◽

Christopher Cook ◽

Yale Liu ◽

Raymond J. Cho ◽

...

Keyword(s):

Cell Surface ◽

Single Cell ◽

Count Data ◽

Riemannian Geometry ◽

Surface Protein ◽

Surface Proteins ◽

Quantitative Detection ◽

Data Sets ◽

Batch Effects ◽

Experimental Conditions

Recent advances in next generation sequencing-based single-cell technologies have allowed high-throughput quantitative detection of cell-surface proteins along with the transcriptome in individual cells, extending our understanding of the heterogeneity of cell populations in diverse tissues that are in different diseased states or under different experimental conditions. Count data of surface proteins from the cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) technology pose new computational challenges, and there is currently a dearth of rigorous mathematical tools for analyzing the data. This work utilizes concepts and ideas from Riemannian geometry to remove batch effects between samples and develops a statistical framework for distinguishing positive signals from background noise. The strengths of these approaches are demonstrated on two independent CITE-seq data sets in mouse and human. Python source code implementing the algorithms is available at https://github.com/jssong-lab/SAGACITE.

An Updated Understanding of the Role of YAP in Driving Oncogenic Responses

Cancers ◽

10.3390/cancers13123100 ◽

2021 ◽

Vol 13 (12) ◽

pp. 3100

Author(s):

Giampaolo Morciano ◽

Bianca Vezzani ◽

Sonia Missiroli ◽

Caterina Boncompagni ◽

Paolo Pinton ◽

...

Keyword(s):

Cell Transformation ◽

State Of The Art ◽

Synergistic Effects ◽

Cancer Signaling ◽

Important Mediator ◽

Multiple Processes ◽

Anticancer Treatment ◽

Multiple Levels ◽

Translational Strategies

Yes-associated protein (YAP) has emerged as a key component in cancer signaling and is considered a potent oncogene. As such, nuclear YAP participates in complex and only partially understood molecular cascades that are responsible for the oncogenic response by regulating multiple processes, including cell transformation, tumor growth, migration, and metastasis, and by acting as an important mediator of immune and cancer cell interactions. YAP is finely regulated at multiple levels, and its localization in cells in terms of cytoplasm–nucleus shuttling (and vice versa) sheds light on interesting novel anticancer treatment opportunities and putative unconventional functions of the protein when retained in the cytosol. This review aims to summarize and present state-of-the-art knowledge about the role of YAP in cancer signaling, first focusing on how YAP differs from WW domain-containing transcription regulator 1 (WWTR1, also known as TAZ) and which upstream factors regulate it. It then focuses on the role of YAP in different cancer stages and in the crosstalk between immune and cancer cells, as well as on growing translational strategies derived from its inhibitory and synergistic effects with existing chemo-, immuno- and radiotherapies.

An Exploratory Study of a Masking Strategy of Cyberdeception Using CyberVAN

Proceedings of the Human Factors and Ergonomics Society Annual Meeting ◽

10.1177/1071181320641100 ◽

2020 ◽

Vol 64 (1) ◽

pp. 446-450

Author(s):

Palvi Aggarwal ◽

Omkar Thakoor ◽

Aditya Mate ◽

Milind Tambe ◽

Edward A. Cranford ◽

...

Keyword(s):

Risk Aversion ◽

Learning Process ◽

Exploratory Study ◽

State Of The Art ◽

Random Condition ◽

Experimental Conditions ◽

Current State ◽

Exploratory Experiment ◽

Human Participants

During the network reconnaissance process, attackers scan the network to gather information before launching an attack. This is a good chance for defenders to use deception and disrupt the attacker’s learning process. In this paper, we present an exploratory experiment to test the effectiveness of a masking strategy (compared to a random masking strategy) in reducing the utility of attackers. A total of 30 human participants (in the role of attackers) were randomly assigned to one of two experimental conditions: Optimal or Random (15 in each condition). Attackers appeared to be more successful in launching attacks in the optimal condition than in the random condition, but the total score of attackers was not different from the random masking strategy. Most importantly, we found a generalized tendency to act according to the certainty bias (or risk aversion). These observations will help to improve the current state-of-the-art masking algorithms of cyberdefense.
