scholarly journals alona: a web server for single-cell RNA-seq analysis

2020 ◽  
Vol 36 (12) ◽  
pp. 3910-3912 ◽  
Author(s):  
Oscar Franzén ◽  
Johan L M Björkegren

Abstract Summary Single-cell RNA sequencing (scRNA-seq) is a technology to measure gene expression in single cells. It has enabled discovery of new cell types and established cell type atlases of tissues and organs. The widespread adoption of scRNA-seq has created a need for user-friendly software for data analysis. We have developed a web server, alona that incorporates several of the most popular single-cell analysis algorithms into a flexible pipeline. alona can perform quality filtering, normalization, batch correction, clustering, cell type annotation and differential gene expression analysis. Data are visualized in the web browser using an interface based on JavaScript, allowing the user to query genes of interest and visualize the cluster structure. alona accepts a compressed gene expression matrix and identifies cell clusters with a graph-based clustering strategy. Cell types are identified from a comprehensive collection of marker genes or by specifying a custom set of marker genes. Availability and implementation The service runs at https://alona.panglaodb.se and the Python package can be downloaded from https://oscar-franzen.github.io/adobo/. Supplementary information Supplementary data are available at Bioinformatics online.

2020 ◽  
Vol 176 (2) ◽  
pp. 396-409
Author(s):  
Kelly M Bakulski ◽  
John F Dou ◽  
Robert C Thompson ◽  
Christopher Lee ◽  
Lauren Y Middleton ◽  
...  

Abstract Lead (Pb) exposure is ubiquitous with permanent neurodevelopmental effects. The hippocampus brain region is involved in learning and memory with heterogeneous cellular composition. The hippocampus cell type-specific responses to Pb are unknown. The objective of this study is to examine perinatal Pb treatment effects on adult hippocampus gene expression, at the level of individual cells. In mice perinatally exposed to control water or a human physiologically relevant level (32 ppm in maternal drinking water) of Pb, 2 weeks prior to mating through weaning, we tested for hippocampus gene expression and cellular differences at 5 months of age. We sequenced RNA from 5258 hippocampal cells to (1) test for treatment gene expression differences averaged across all cells, (2) compare cell cluster composition by treatment, and (3) test for treatment gene expression and pathway differences within cell clusters. Gene expression patterns revealed 12 hippocampus cell clusters, mapping to major expected cell types (eg, microglia, astrocytes, neurons, and oligodendrocytes). Perinatal Pb treatment was associated with 12.4% more oligodendrocytes (p = 4.4 × 10−21) in adult mice. Across all cells, Pb treatment was associated with expression of cell cluster marker genes. Within cell clusters, Pb treatment (q < 0.05) caused differential gene expression in endothelial, microglial, pericyte, and astrocyte cells. Pb treatment upregulated protein folding pathways in microglia (p = 3.4 × 10−9) and stress response in oligodendrocytes (p = 3.2 × 10−5). Bulk tissue analysis may be influenced by changes in cell type composition, obscuring effects within vulnerable cell types. This study serves as a biological reference for future single-cell toxicant studies, to ultimately characterize molecular effects on cognition and behavior.


2018 ◽  
Author(s):  
Douglas Abrams ◽  
Parveen Kumar ◽  
R. Krishna Murthy Karuturi ◽  
Joshy George

AbstractBackgroundThe advent of single cell RNA sequencing (scRNA-seq) enabled researchers to study transcriptomic activity within individual cells and identify inherent cell types in the sample. Although numerous computational tools have been developed to analyze single cell transcriptomes, there are no published studies and analytical packages available to guide experimental design and to devise suitable analysis procedure for cell type identification.ResultsWe have developed an empirical methodology to address this important gap in single cell experimental design and analysis into an easy-to-use tool called SCEED (Single Cell Empirical Experimental Design and analysis). With SCEED, user can choose a variety of combinations of tools for analysis, conduct performance analysis of analytical procedures and choose the best procedure, and estimate sample size (number of cells to be profiled) required for a given analytical procedure at varying levels of cell type rarity and other experimental parameters. Using SCEED, we examined 3 single cell algorithms using 48 simulated single cell datasets that were generated for varying number of cell types and their proportions, number of genes expressed per cell, number of marker genes and their fold change, and number of single cells successfully profiled in the experiment.ConclusionsBased on our study, we found that when marker genes are expressed at fold change of 4 or more than the rest of the genes, either Seurat or Simlr algorithm can be used to analyze single cell dataset for any number of single cells isolated (minimum 1000 single cells were tested). However, when marker genes are expected to be only up to fC 2 upregulated, choice of the single cell algorithm is dependent on the number of single cells isolated and proportion of rare cell type to be identified. In conclusion, our work allows the assessment of various single cell methods and also aids in examining the single cell experimental design.


2021 ◽  
Author(s):  
Julia Eve Olivieri ◽  
Roozbeh Dehghannasiri ◽  
Peter Wang ◽  
SoRi Jang ◽  
Antoine de Morree ◽  
...  

More than 95% of human genes are alternatively spliced. Yet, the extent splicing is regulated at single-cell resolution has remained controversial due to both available data and methods to interpret it. We apply the SpliZ, a new statistical approach that is agnostic to transcript annotation, to detect cell-type-specific regulated splicing in > 110K carefully annotated single cells from 12 human tissues. Using 10x data for discovery, 9.1% of genes with computable SpliZ scores are cell-type specifically spliced. These results are validated with RNA FISH, single cell PCR, and in high throughput with Smart-seq2. Regulated splicing is found in ubiquitously expressed genes such as actin light chain subunit MYL6 and ribosomal protein RPS24, which has an epithelial-specific microexon. 13% of the statistically most variable splice sites in cell-type specifically regulated genes are also most variable in mouse lemur or mouse. SpliZ analysis further reveals 170 genes with regulated splicing during sperm development using, 10 of which are conserved in mouse and mouse lemur. The statistical properties of the SpliZ allow model-based identification of subpopulations within otherwise indistinguishable cells based on gene expression, illustrated by subpopulations of classical monocytes with stereotyped splicing, including an un-annotated exon, in SAT1, a Diamine acetyltransferase. Together, this unsupervised and annotation-free analysis of differential splicing in ultra high throughput droplet-based sequencing of human cells across multiple organs establishes splicing is regulated cell-type-specifically independent of gene expression.


2018 ◽  
Author(s):  
Nikos Konstantinides ◽  
Katarina Kapuralin ◽  
Chaimaa Fadil ◽  
Luendreo Barboza ◽  
Rahul Satija ◽  
...  

SummaryTranscription factors regulate the molecular, morphological, and physiological characters of neurons and generate their impressive cell type diversity. To gain insight into general principles that govern how transcription factors regulate cell type diversity, we used large-scale single-cell mRNA sequencing to characterize the extensive cellular diversity in the Drosophila optic lobes. We sequenced 55,000 single optic lobe neurons and glia and assigned them to 52 clusters of transcriptionally distinct single cells. We validated the clustering and annotated many of the clusters using RNA sequencing of characterized FACS-sorted single cell types, as well as marker genes specific to given clusters. To identify transcription factors responsible for inducing specific terminal differentiation features, we used machine-learning to generate a ‘random forest’ model. The predictive power of the model was confirmed by showing that two transcription factors expressed specifically in cholinergic (apterous) and glutamatergic (traffic-jam) neurons are necessary for the expression of ChAT and VGlut in many, but not all, cholinergic or glutamatergic neurons, respectively. We used a transcriptome-wide approach to show that the same terminal characters, including but not restricted to neurotransmitter identity, can be regulated by different transcription factors in different cell types, arguing for extensive phenotypic convergence. Our data provide a deep understanding of the developmental and functional specification of a complex brain structure.


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Tianyuan Lu ◽  
Jessica C. Mar

Abstract Background It is a long established fact that sex is an important factor that influences the transcriptional regulatory processes of an organism. However, understanding sex-based differences in gene expression has been limited because existing studies typically sequence and analyze bulk tissue from female or male individuals. Such analyses average cell-specific gene expression levels where cell-to-cell variation can easily be concealed. We therefore sought to utilize data generated by the rapidly developing single cell RNA sequencing (scRNA-seq) technology to explore sex dimorphism and its functional consequences at the single cell level. Methods Our study included scRNA-seq data of ten well-defined cell types from the brain and heart of female and male young adult mice in the publicly available tissue atlas dataset, Tabula Muris. We combined standard differential expression analysis with the identification of differential distributions in single cell transcriptomes to test for sex-based gene expression differences in each cell type. The marker genes that had sex-specific inter-cellular changes in gene expression formed the basis for further characterization of the cellular functions that were differentially regulated between the female and male cells. We also inferred activities of transcription factor-driven gene regulatory networks by leveraging knowledge of multidimensional protein-to-genome and protein-to-protein interactions and analyzed pathways that were potential modulators of sex differentiation and dimorphism. Results For each cell type in this study, we identified marker genes with significantly different mean expression levels or inter-cellular distribution characteristics between female and male cells. These marker genes were enriched in pathways that were closely related to the biological functions of each cell type. We also identified sub-cell types that possibly carry out distinct biological functions that displayed discrepancies between female and male cells. Additionally, we found that while genes under differential transcriptional regulation exhibited strong cell type specificity, six core transcription factor families responsible for most sex-dimorphic transcriptional regulation activities were conserved across the cell types, including ASCL2, EGR, GABPA, KLF/SP, RXRα, and ZF. Conclusions We explored novel gene expression-based biomarkers, functional cell group compositions, and transcriptional regulatory networks associated with sex dimorphism with a novel computational pipeline. Our findings indicated that sex dimorphism might be widespread across the transcriptomes of cell types, cell type-specific, and impactful for regulating cellular activities.


2020 ◽  
Author(s):  
Ying Lei ◽  
Mengnan Cheng ◽  
Zihao Li ◽  
Zhenkun Zhuang ◽  
Liang Wu ◽  
...  

Non-human primates (NHP) provide a unique opportunity to study human neurological diseases, yet detailed characterization of the cell types and transcriptional regulatory features in the NHP brain is lacking. We applied a combinatorial indexing assay, sci-ATAC-seq, as well as single-nuclei RNA-seq, to profile chromatin accessibility in 43,793 single cells and transcriptomics in 11,477 cells, respectively, from prefrontal cortex, primary motor cortex and the primary visual cortex of adult cynomolgus monkey Macaca fascularis. Integrative analysis of these two datasets, resolved regulatory elements and transcription factors that specify cell type distinctions, and discovered area-specific diversity in chromatin accessibility and gene expression within excitatory neurons. We also constructed the dynamic landscape of chromatin accessibility and gene expression of oligodendrocyte maturation to characterize adult remyelination. Furthermore, we identified cell type-specific enrichment of differentially spliced gene isoforms and disease-associated single nucleotide polymorphisms. Our datasets permit integrative exploration of complex regulatory dynamics in macaque brain tissue at single-cell resolution.


2019 ◽  
Author(s):  
Kelly M. Bakulski ◽  
John F. Dou ◽  
Robert C. Thompson ◽  
Christopher Lee ◽  
Lauren Y. Middleton ◽  
...  

AbstractBackgroundLead (Pb) exposure is ubiquitous and has permanent developmental effects on childhood intelligence and behavior and adulthood risk of dementia. The hippocampus is a key brain region involved in learning and memory, and its cellular composition is highly heterogeneous. Pb acts on the hippocampus by altering gene expression, but the cell type-specific responses are unknown.ObjectiveExamine the effects of perinatal Pb treatment on adult hippocampus gene expression, at the level of individual cells, in mice.MethodsIn mice perinatally exposed to control water (n=4) or a human physiologically-relevant level (32 ppm in maternal drinking water) of Pb (n=4), two weeks prior to mating through weaning, we tested for gene expression and cellular differences in the hippocampus at 5-months of age. Analysis was performed using single cell RNA-sequencing of 5,258 cells from the hippocampus by 10x Genomics Chromium to 1) test for gene expression differences averaged across all cells by treatment; 2) compare cell cluster composition by treatment; and 3) test for gene expression and pathway differences within cell clusters by treatment.ResultsGene expression patterns revealed 12 cell clusters in the hippocampus, mapping to major expected cell types (e.g. microglia, astrocytes, neurons, oligodendrocytes). Perinatal Pb treatment was associated with 12.4% more oligodendrocytes (P=4.4×10−21) in adult mice. Across all cells, differential gene expression analysis by Pb treatment revealed cluster marker genes. Within cell clusters, differential gene expression with Pb treatment (q<0.05) was observed in endothelial, microglial, pericyte, and astrocyte cells. Pathways up-regulated with Pb treatment were protein folding in microglia (P=3.4×10−9) and stress response in oligodendrocytes (P=3.2×10−5).ConclusionBulk tissue analysis may be confounded by changes in cell type composition and may obscure effects within vulnerable cell types. This study serves as a biological reference for future single cell studies of toxicant or neuronal complications, to ultimately characterize the molecular basis by which Pb influences cognition and behavior.


2017 ◽  
Author(s):  
Giovanni Iacono ◽  
Elisabetta Mereu ◽  
Amy Guillaumet-Adkins ◽  
Roser Corominas ◽  
Ivon Cuscó ◽  
...  

AbstractSingle-cell RNA sequencing significantly deepened our insights into complex tissues and latest techniques are capable processing ten-thousands of cells simultaneously. With bigSCale, we provide an analytical framework being scalable to analyze millions of cells, addressing challenges of future large datasets. Unlike previous methods, bigSCale does not constrain data to fit an a priori-defined distribution and instead uses an accurate numerical model of noise. We evaluated the performance of bigSCale using a biological model of aberrant gene expression in patient derived neuronal progenitor cells and simulated datasets, which underlined its speed and accuracy in differential expression analysis. We further applied bigSCale to analyze 1.3 million cells from the mouse developing forebrain. Herein, we identified rare populations, such as Reelin positive Cajal-Retzius neurons, for which we determined a previously not recognized heterogeneity associated to distinct differentiation stages, spatial organization and cellular function. Together, bigSCale presents a perfect solution to address future challenges of large single-cell datasets.Extended AbstractSingle-cell RNA sequencing (scRNAseq) significantly deepened our insights into complex tissues by providing high-resolution phenotypes for individual cells. Recent microfluidic-based methods are scalable to ten-thousands of cells, enabling an unbiased sampling and comprehensive characterization without prior knowledge. Increasing cell numbers, however, generates extremely big datasets, which extends processing time and challenges computing resources. Current scRNAseq analysis tools are not designed to analyze datasets larger than from thousands of cells and often lack sensitivity and specificity to identify marker genes for cell populations or experimental conditions. With bigSCale, we provide an analytical framework for the sensitive detection of population markers and differentially expressed genes, being scalable to analyze millions of single cells. Unlike other methods that use simple or mixture probabilistic models with negative binomial, gamma or Poisson distributions to handle the noise and sparsity of scRNAseq data, bigSCale does not constrain the data to fit an a priori-defined distribution. Instead, bigSCale uses large sample sizes to estimate a highly accurate and comprehensive numerical model of noise and gene expression. The framework further includes modules for differential expression (DE) analysis, cell clustering and population marker identification. Moreover, a directed convolution strategy allows processing of extremely large data sets, while preserving the transcript information from individual cells.We evaluate the performance of bigSCale using a biological model for reduced or elevated gene expression levels. Specifically, we perform scRNAseq of 1,920 patient derived neuronal progenitor cells from Williams-Beuren and 7q11.23 microduplication syndrome patients, harboring a deletion or duplication of 7q11.23, respectively. The affected region contains 28 genes whose transcriptional levels vary in line with their allele frequency. BigSCale detects expression changes with respect to cells from a healthy donor and outperforms other methods for single-cell DE analysis in sensitivity. Simulated data sets, underline the performance of bigSCale in DE analysis as it is faster and more sensitive and specific than other methods. The probabilistic model of cell-distances within bigSCale is further suitable for unsupervised clustering and the identification of cell types and subpopulations. Using bigSCale, we identify all major cell types of the somatosensory cortex and hippocampus analyzing 3,005 cells from adult mouse brains. Remarkably, we increase the number of cell population specific marker genes 4-6-fold compared to the original analysis and, moreover, define markers of higher order cell types. These include CD90 (Thy1), a neuronal surface receptor, potentially suitable for isolating intact neurons from complex brain samples.To test its applicability for large data sets, we apply bigSCale on scRNAseq data from 1.3 million cells derived from the pallium of the mouse developing forebrain (E18, 10x Genomics). Our directed down-sampling strategy accumulates transcript counts from cells with similar transcriptional profiles into index cell transcriptomes, thereby defining cellular clusters with improved resolution. Accordingly, index cell clusters provide a rich resource of marker genes for the main brain cell types and less frequent subpopulations. Our analysis of rare populations includes poorly characterized developmental cell types, such as neuron progenitors from the subventricular zone and neocortical Reelin positive neurons known as Cajal-Retzius (CR) cells. The latter represent a transient population which regulates the laminar formation of the developing neocortex and whose malfunctioning causes major neurodevelopmental disorders like autism or schizophrenia. Most importantly, index cell cluster can be deconvoluted to individual cell level for targeted analysis of populations of interest. Through decomposition of Reelin positive neurons, we determined a previously not recognized heterogeneity among CR cells, which we could associate to distinct differentiation stages as well as spatial and functional differences in the developing mouse brain. Specifically, subtypes of CR cells identified by bigSCale express different compositions of NMDA, AMPA and glycine receptor subunits, pointing to subpopulations with distinct membrane properties. Furthermore, we found Cxcl12, a chemokine secreted by the meninges and regulating the tangential migration of CR cells, to be also expressed in CR cells located in the marginal zone of the neocortex, indicating a self-regulated migration capacity.Together, bigSCale presents a perfect solution for the processing and analysis of scRNAseq data from millions of single cells. Its speed and sensitivity makes it suitable to the address future challenges of large single-cell data sets.


Author(s):  
Isabella N. Grabski ◽  
Rafael A. Irizarry

AbstractSingle-cell RNA sequencing (scRNA-seq) quantifies gene expression for individual cells in a sample, which allows distinct cell-type populations to be identified and characterized. An important step in many scRNA-seq analysis pipelines is the annotation of cells into known cell-types. While this can be achieved using experimental techniques, such as fluorescence-activated cell sorting, these approaches are impractical for large numbers of cells. This motivates the development of data-driven cell-type annotation methods. We find limitations with current approaches due to the reliance on known marker genes or from overfitting because of systematic differences between studies or batch effects. Here, we present a statistical approach that leverages public datasets to combine information across thousands of genes, uses a latent variable model to define cell-type-specific barcodes and account for batch effect variation, and probabilistically annotates cell-type identity. The barcoding approach also provides a new way to discover marker genes. Using a range of datasets, including those generated to represent imperfect real-world reference data, we demonstrate that our approach substantially outperforms current reference-based methods, in particular when predicting across studies. Our approach also demonstrates that current approaches based on unsupervised clustering lead to false discoveries related to novel cell-types.


2016 ◽  
Vol 113 (30) ◽  
pp. 8508-8513 ◽  
Author(s):  
Megan Ealy ◽  
Daniel C. Ellwanger ◽  
Nina Kosaric ◽  
Andres P. Stapper ◽  
Stefan Heller

Efficient pluripotent stem cell guidance protocols for the production of human posterior cranial placodes such as the otic placode that gives rise to the inner ear do not exist. Here we use a systematic approach including defined monolayer culture, signaling modulation, and single-cell gene expression analysis to delineate a developmental trajectory for human otic lineage specification in vitro. We found that modulation of bone morphogenetic protein (BMP) and WNT signaling combined with FGF and retinoic acid treatments over the course of 18 days generates cell populations that develop chronological expression of marker genes of non-neural ectoderm, preplacodal ectoderm, and early otic lineage. Gene expression along this differentiation path is distinct from other lineages such as endoderm, mesendoderm, and neural ectoderm. Single-cell analysis exposed the heterogeneity of differentiating cells and allowed discrimination of non-neural ectoderm and otic lineage cells from off-target populations. Pseudotemporal ordering of human embryonic stem cell and induced pluripotent stem cell-derived single-cell gene expression profiles revealed an initially synchronous guidance toward non-neural ectoderm, followed by comparatively asynchronous occurrences of preplacodal and otic marker genes. Positive correlation of marker gene expression between both cell lines and resemblance to mouse embryonic day 10.5 otocyst cells implied reasonable robustness of the guidance protocol. Single-cell trajectory analysis further revealed that otic progenitor cell types are induced in monolayer cultures, but further development appears impeded, likely because of lack of a lineage-stabilizing microenvironment. Our results provide a framework for future exploration of stabilizing microenvironments for efficient differentiation of stem cell-generated human otic cell types.


Sign in / Sign up

Export Citation Format

Share Document