Identification and Removal of Doublets with DoubletDecon

2020 ◽  
Author(s):  
Erica A. K. DePasquale ◽  
Daniel Schnell ◽  
Kashish Chetal ◽  
Nathan Salomonis

Summary: Retention of multiplet captures in single-cell RNA-sequencing (scRNA-seq) data can hinder identification of discrete or transitional cell populations and associated marker genes. To overcome this challenge, we created DoubletDecon to identify and remove doublets, multiplets of two cells, by using a combination of deconvolution to identify putative doublets and analyses of unique gene expression. Here we provide the protocol for running DoubletDecon on scRNA-seq data. For complete details on the use of this protocol, please see DePasquale et al. (2019) (https://doi.org/10.1016/j.celrep.2019.09.082).
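The idea of flagging putative doublets by deconvolution can be illustrated with a toy sketch. This is not DoubletDecon's actual algorithm or API, which the summary does not describe in detail. It assumes that each cluster is represented by a centroid expression vector and that a cell is suspicious when it is best explained as a roughly balanced mixture of two centroids. The function names, the centroid values, and the 0.25 to 0.75 thresholds are all hypothetical.

```python
from itertools import combinations


def mixing_weight(cell, ref_a, ref_b):
    """Least-squares weight w in [0, 1] such that cell ~ w*ref_a + (1-w)*ref_b."""
    diff = [a - b for a, b in zip(ref_a, ref_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        return 1.0  # identical references: the weight is not identifiable
    num = sum((x - b) * d for x, b, d in zip(cell, ref_b, diff))
    return min(1.0, max(0.0, num / denom))


def residual(cell, ref_a, ref_b, w):
    """Squared error of the two-reference mixture fit."""
    return sum((x - (w * a + (1 - w) * b)) ** 2
               for x, a, b in zip(cell, ref_a, ref_b))


def flag_putative_doublet(cell, centroids, low=0.25, high=0.75):
    """Return (is_doublet, best_pair, weight_on_first_of_pair) for one cell.

    `low` and `high` are illustrative thresholds, not values from the paper.
    """
    if len(centroids) < 2:
        raise ValueError("need at least two cluster centroids")
    best = None
    for name_a, name_b in combinations(sorted(centroids), 2):
        w = mixing_weight(cell, centroids[name_a], centroids[name_b])
        err = residual(cell, centroids[name_a], centroids[name_b], w)
        if best is None or err < best[0]:
            best = (err, (name_a, name_b), w)
    _, pair, w = best
    return low <= w <= high, pair, w


# Hypothetical centroids over three genes.
centroids = {
    "T_cell": [10.0, 0.0, 1.0],
    "B_cell": [0.0, 10.0, 1.0],
}
singlet = [9.0, 1.0, 1.0]   # close to the T_cell centroid
doublet = [5.0, 5.0, 1.0]   # even mixture of both centroids

print(flag_putative_doublet(singlet, centroids))
print(flag_putative_doublet(doublet, centroids))
```

A real pipeline would still need the second step named in the summary, checking for unique gene expression, so that genuine transitional cells are not discarded as doublets.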

2020 ◽  
Author(s):  
Min Feng ◽  
Junming Xia ◽  
Shigang Fei ◽  
Xiong Wang ◽  
Yaohong Zhou ◽  
...  

Abstract: A wide range of hemocyte types exist in insects, but a full definition of the different subclasses is not yet established. Current knowledge of the classification of silkworm hemocytes comes mainly from morphology rather than specific markers, so our understanding of the detailed classification, lineage and functions of silkworm hemocytes is very incomplete. Bombyx mori nucleopolyhedrovirus (BmNPV) is a representative member of the baculoviruses and a major pathogen that specifically infects silkworms and causes serious losses in the sericulture industry. Here, we performed single-cell RNA sequencing (scRNA-seq) of silkworm hemocytes in BmNPV-infected and mock-infected larvae to comprehensively identify silkworm hemocyte subsets and to determine the specific molecular and cellular characteristics of each hemocyte subset before and after viral infection. A total of 19 cell clusters and their potential marker genes were identified in silkworm hemocytes. Among these hemocyte clusters, clusters 0, 1, 2, 5 and 9 might be granulocytes (GR); clusters 14 and 17 were predicted as plasmatocytes (PL); cluster 18 was tentatively identified as spherulocytes (SP); and clusters 7 and 11 could possibly correspond to oenocytoids (OE). In addition, all of the hemocyte clusters were infected by BmNPV, and some infected cells carried a high viral load in silkworm larvae at 3 days post infection (dpi). Interestingly, BmNPV infection can cause severe and diverse changes in gene expression in hemocytes. Cells belonging to the infection group were mainly located at the early stage of the pseudotime trajectories. Furthermore, we found that BmNPV infection suppresses the immune response in the major hemocyte types. In summary, our scRNA-seq analysis revealed the diversity of silkworm hemocytes and provided a rich resource of gene expression profiles for a systems-level understanding of their functions in the uninfected condition and in response to BmNPV.


Oncogene ◽  
2021 ◽  
Author(s):  
Philip Bischoff ◽  
Alexandra Trinks ◽  
Benedikt Obermayer ◽  
Jan Patrick Pett ◽  
Jennifer Wiederspahn ◽  
...  

Abstract: Recent developments in immuno-oncology demonstrate that not only cancer cells, but also the tumor microenvironment can guide precision medicine. A comprehensive and in-depth characterization of the tumor microenvironment is challenging since its cell populations are diverse and can be important even if scarce. To identify clinically relevant microenvironmental and cancer features, we applied single-cell RNA sequencing to ten human lung adenocarcinomas and ten normal control tissues. Our analyses revealed heterogeneous carcinoma cell transcriptomes reflecting histological grade and oncogenic pathway activities, and two distinct microenvironmental patterns. The immune-activated CP²E microenvironment was composed of cancer-associated myofibroblasts, proinflammatory monocyte-derived macrophages, plasmacytoid dendritic cells and exhausted CD8+ T cells, and was prognostically unfavorable. In contrast, the inert N³MC microenvironment was characterized by normal-like myofibroblasts, non-inflammatory monocyte-derived macrophages, NK cells, myeloid dendritic cells and conventional T cells, and was associated with a favorable prognosis. Microenvironmental marker genes and signatures identified in single-cell profiles had prognostic value in bulk tumor profiles. In summary, single-cell RNA profiling of lung adenocarcinoma provides additional prognostic information based on the microenvironment, and may help to predict therapy response and to reveal possible target cell populations for future therapeutic approaches.


Author(s):  
Xiaojun Yuan ◽  
Janith A. Seneviratne ◽  
Shibei Du ◽  
Ying Xu ◽  
Yijun Chen ◽  
...  

Abstract: Peripheral neuroblastic tumors (PNTs) are the most common extracranial solid tumors in early childhood. They represent a spectrum of neural crest derived tumors including neuroblastoma, ganglioneuroblastoma and ganglioneuroma. PNTs exhibit heterogeneity due to interconverting malignant cell states described as adrenergic/nor-adrenergic or mesenchymal/neural crest cell in origin. The factors determining individual patient levels of tumor heterogeneity, their impact on the malignant phenotype, and the presence of other cell states are unknown. Here, single-cell RNA-sequencing analysis of 4267 cells from 7 PNTs demonstrated extensive transcriptomic heterogeneity. Trajectory modelling showed that malignant neuroblasts move between adrenergic and mesenchymal cell states via a novel state that we termed a “transitional” phenotype. Transitional cells are characterized by gene expression programs linked to sympathoadrenal development and to aggressive tumor phenotypes such as rapid proliferation and tumor dissemination. Among primary bulk tumor patient cohorts, high expression of the transitional gene signature was highly predictive of poor prognosis when compared to adrenergic and mesenchymal expression patterns. High transitional gene expression in neuroblastoma cell lines identified a similar transitional H3K27-acetylation super-enhancer landscape, supporting the concept that PNTs have phenotypic plasticity and transdifferentiation capacity. Additionally, examination of PNT microenvironments found that neuroblastomas contained low immune cell infiltration, high levels of non-inflammatory macrophages, and low cytotoxic T lymphocyte levels compared with more benign PNT subtypes. Modeling of cell-cell signaling in the tumor microenvironment predicted specific paracrine effects toward the various subtypes of malignant cells, suggesting further cell-extrinsic influences on malignant cell phenotype. Collectively, our study reveals the presence of a previously unrecognized transitional cell state with high malignant potential and an immune cell architecture, both of which serve as potential biomarkers and therapeutic targets.


2021 ◽  
Author(s):  
Manman Dai ◽  
Min Feng ◽  
Ziwei Li ◽  
Weisan Chen ◽  
Ming Liao

Abstract: Chicken peripheral blood lymphocytes (PBLs) exhibit wide-ranging cell types, but current understanding of their subclasses, immune cell classification, and function is limited and incomplete. Previously, we found that viremia caused by avian leukosis virus subgroup J (ALV-J) was eliminated by 21 days post infection (DPI), accompanied by an increased CD8+ T cell ratio in PBLs and low antibody levels. Here we performed single-cell RNA sequencing (scRNA-seq) of PBLs in ALV-J infected and control chickens at 21 DPI to determine chicken PBL subsets and their specific molecular and cellular characteristics, before and after viral infection. Eight cell clusters and their potential marker genes were identified in chicken PBLs. T cell populations (clusters 6 and 7) had the strongest response to ALV-J infection at 21 DPI, based on detection of the largest number of differentially expressed genes (DEGs). T cell populations of clusters 6 and 7 could be further divided into four subsets: activated CD4+ T cells (cluster A0), Th1-like cells (cluster A2), Th2-like cells (cluster A1), and cytotoxic CD8+ T cells. Hallmark genes for each T cell subset's response to viral infection were initially identified. Furthermore, pseudotime analysis results suggested that chicken CD4+ T cells could potentially differentiate into Th1-like and Th2-like cells. Moreover, ALV-J infection probably induced CD4+ T cell differentiation into Th1-like cells, in which the most immune-related DEGs were detected. Compared with the control group, ALV-J infection also had an obvious impact on PBL cell composition. B cells showed an inconspicuous response and their numbers decreased in PBLs of the ALV-J infected chickens at 21 DPI. Percentages of cytotoxic Th1-like cells and CD8+ T cells were increased in the T cell population of PBLs from ALV-J infected chickens, which were potentially key mitigating factors against ALV-J infection. More importantly, our results provided a rich resource of gene expression profiles of chicken PBL subsets for a systems-level understanding of their function under homeostatic conditions as well as in response to viral infection.


2021 ◽  
Author(s):  
Elnaz Mirzaei Mehrabad ◽  
Aditya Bhaskara ◽  
Benjamin T. Spike

Abstract

Motivation: Single-cell RNA sequencing (scRNA-seq) is a powerful gene expression profiling technique that is presently revolutionizing the study of complex cellular systems in the biological sciences. Existing scRNA-seq methods suffer from sub-optimal target recovery, leading to inaccurate measurements including many false negatives. The resulting ‘zero-inflated’ data may confound data interpretation and visualization.

Results: Since cells have coherent phenotypes defined by conserved molecular circuitries (i.e. multiple gene products working together) and since similar cells utilize similar circuits, information about each expression value or ‘node’ in a multi-cell, multi-gene scRNA-seq data set is expected to also be predictable from other nodes in the data set. Based on this logic, several approaches have been proposed to impute missing values by extracting information from non-zero measurements in a data set. In this study, we applied non-negative matrix factorization approaches to a selection of published scRNA-seq data sets to recommend new values where original measurements are likely to be inaccurate and where ‘zero’ measurements are predicted to be false negatives. The resulting imputed data model predicts novel cell type markers and expression patterns more closely matching gene expression values from orthogonal measurements and/or predicted literature than the values obtained from other previously published imputation approaches.

Contact: [email protected]

Availability and implementation: FIESTA is written in R and is available at https://github.com/elnazmirzaei/FIESTA and https://github.com/TheSpikeLab/FIESTA.
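As a rough illustration of imputation by non-negative matrix factorization, the sketch below factors a small genes-by-cells matrix with the generic Lee-Seung multiplicative updates and fills zero entries from the low-rank reconstruction. It is not the FIESTA implementation, which is written in R and whose details are not given in the abstract. The function names, the rank, the iteration count, and the toy matrix are assumptions chosen for the example.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def nmf(v, rank, iters=500, seed=0, eps=1e-9):
    """Factor v (genes x cells) as w @ h using multiplicative updates."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    # Strictly positive starting values keep the factors non-negative.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_cols)]
             for i in range(rank)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n_rows)]
    return w, h


def impute_zeros(v, rank):
    """Keep observed non-zero values; replace zeros with the low-rank estimate."""
    w, h = nmf(v, rank)
    approx = matmul(w, h)
    return [[x if x != 0 else approx[i][j] for j, x in enumerate(row)]
            for i, row in enumerate(v)]


# Toy matrix whose rows are proportional, except for one suspected dropout.
counts = [
    [1.0, 2.0, 4.0],
    [2.0, 4.0, 8.0],
    [3.0, 0.0, 12.0],  # the 0 stands in for a possible false negative
]
imputed = impute_zeros(counts, rank=1)
for row in imputed:
    print([round(x, 2) for x in row])
```

This sketch treats every zero as a candidate dropout. Deciding which zeros are likely false negatives, as the abstract describes, would need an additional criterion that is not shown here.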


2018 ◽  
Vol 115 (28) ◽  
pp. E6437-E6446 ◽  
Author(s):  
Jingshu Wang ◽  
Mo Huang ◽  
Eduardo Torre ◽  
Hannah Dueck ◽  
Sydney Shaffer ◽  
...  

Single-cell RNA sequencing (scRNA-seq) enables the quantification of each gene’s expression distribution across cells, thus allowing the assessment of the dispersion, nonzero fraction, and other aspects of its distribution beyond the mean. These statistical characterizations of the gene expression distribution are critical for understanding expression variation and for selecting marker genes for population heterogeneity. However, scRNA-seq data are noisy, with each cell typically sequenced at low coverage, thus making it difficult to infer properties of the gene expression distribution from raw counts. Based on a reexamination of nine public datasets, we propose a simple technical noise model for scRNA-seq data with unique molecular identifiers (UMI). We develop deconvolution of single-cell expression distribution (DESCEND), a method that deconvolves the true cross-cell gene expression distribution from observed scRNA-seq counts, leading to improved estimates of properties of the distribution such as dispersion and nonzero fraction. DESCEND can adjust for cell-level covariates such as cell size, cell cycle, and batch effects. DESCEND’s noise model and estimation accuracy are further evaluated through comparisons to RNA FISH data, through data splitting and simulations and through its effectiveness in removing known batch effects. We demonstrate how DESCEND can clarify and improve downstream analyses such as finding differentially expressed genes, identifying cell types, and selecting differentiation markers.
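To make the deconvolution idea concrete, here is a minimal method-of-moments sketch. It assumes a Poisson sampling model in which each observed UMI count is drawn as Poisson(efficiency × true expression). That specific noise form, the per-gene `efficiency` parameter, and the function name are assumptions for illustration, since the abstract does not state them. DESCEND itself estimates the whole distribution, while this sketch only recovers its mean and variance.

```python
from statistics import mean, pvariance


def moment_deconvolve(umi_counts, efficiency):
    """Estimate the mean and variance of true expression from UMI counts.

    Assumes count ~ Poisson(efficiency * true_expression), which gives
      E[count]   = efficiency * E[true]
      Var[count] = efficiency * E[true] + efficiency**2 * Var[true]
    so the Poisson sampling variance (equal to the observed mean) is
    subtracted before rescaling.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    observed_mean = mean(umi_counts)
    observed_var = pvariance(umi_counts)
    true_mean = observed_mean / efficiency
    # Clip at zero: sampling noise can push the raw estimate negative.
    true_var = max(0.0, (observed_var - observed_mean) / efficiency ** 2)
    return true_mean, true_var


# Hypothetical counts for one gene across four cells, 50% capture efficiency.
print(moment_deconvolve([0, 0, 4, 4], efficiency=0.5))
```

With these inputs the observed mean is 2 and the observed variance is 4, so the estimated true mean is 4.0 and the estimated true variance is 8.0. Using the raw counts directly would attribute the Poisson sampling noise to biological dispersion.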


2017 ◽  
Author(s):  
Jingshu Wang ◽  
Mo Huang ◽  
Eduardo Torre ◽  
Hannah Dueck ◽  
Sydney Shaffer ◽  
...  

Abstract: Single-cell RNA sequencing (scRNA-seq) enables the quantification of each gene’s expression distribution across cells, thus allowing the assessment of the dispersion, burstiness, and other aspects of its distribution beyond the mean. These statistical characterizations of the gene expression distribution are critical for understanding expression variation and for selecting marker genes for population heterogeneity. However, scRNA-seq data is noisy, with each cell typically sequenced at low coverage, thus making it difficult to infer properties of the gene expression distribution from raw counts. Based on a re-examination of 9 public data sets, we propose a simple technical noise model for scRNA-seq data with Unique Molecular Identifiers (UMI). We develop DESCEND, a method that deconvolves the true cross-cell gene expression distribution from observed scRNA-seq counts, leading to improved estimates of properties of the distribution such as dispersion and burstiness. DESCEND can adjust for cell-level covariates such as cell size, cell cycle and batch effects. DESCEND’s noise model and estimation accuracy are further evaluated through comparisons to RNA FISH data, through data splitting and simulations, and through its effectiveness in removing known batch effects. We demonstrate how DESCEND can clarify and improve downstream analyses such as finding differentially bursty genes, identifying cell types, and selecting differentiation markers.

