Sources of Variation in Cell-Type RNA-Seq Profiles

10.21203/rs.2.23415/v2 ◽

2020 ◽

Author(s):

Johan Gustafsson ◽

Felix Held ◽

Jonathan Robinson ◽

Elias Björnson ◽

Rebecka Jörnsten ◽

...

Keyword(s):

Gene Expression ◽

Single Cell ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Specific Gene ◽

Rna Seq ◽

Cell Type ◽

Specific Gene Expression ◽

Cell Type Specific ◽

Technical Factors

Abstract Cell-type specific gene expression profiles are needed for many computational methods operating on bulk RNA-Seq samples, such as deconvolution of cell-type fractions and digital cytometry. However, the gene expression profile of a cell type can vary substantially due to both technical factors and biological differences in cell state and surroundings, reducing the efficacy of such methods. Here, we investigated which factors contribute most to this variation. We evaluated different normalization methods, quantified the variance explained by different factors, evaluated the effect on deconvolution of cell-type fractions, and examined the differences between UMI-based single-cell RNA-Seq and bulk RNA-Seq. We investigated a collection of publicly available bulk and single-cell RNA-Seq datasets containing B and T cells, and found that the technical variation across laboratories is substantial, even for genes specifically selected for deconvolution, and has a confounding effect on deconvolution. Tissue of origin is also a substantial factor, highlighting the challenge of applying cell-type profiles derived from blood to mixtures from other tissues. We also show that much of the difference between UMI-based single-cell and bulk RNA-Seq methods can be explained by the number of read duplicates per mRNA molecule in the single-cell sample. Our work shows the importance of either matching or correcting for technical factors when creating cell-type specific gene expression profiles that are to be used together with bulk samples.
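The confounding effect described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a two-cell-type mixture, a plain least-squares estimate of the B-cell fraction, and entirely made-up expression values and gene-wise bias factors (`b_cell`, `t_cell`, `lab_bias`, and `estimate_fraction` are hypothetical names chosen for the example).

```python
def estimate_fraction(bulk, profile_a, profile_b):
    """Least-squares estimate of the fraction of cell type A in a two-type mixture.

    Models bulk[g] ~ f * profile_a[g] + (1 - f) * profile_b[g], solves for f
    in closed form, and clips the result to [0, 1].
    """
    num = sum((m - b) * (a - b) for m, a, b in zip(bulk, profile_a, profile_b))
    den = sum((a - b) ** 2 for a, b in zip(profile_a, profile_b))
    if den == 0:
        raise ValueError("profiles are identical; fraction is not identifiable")
    return min(1.0, max(0.0, num / den))


# Hypothetical expression values for four marker genes (arbitrary units).
b_cell = [100.0, 80.0, 5.0, 10.0]
t_cell = [5.0, 10.0, 90.0, 70.0]

# Simulated bulk sample with a known B-cell fraction.
true_fraction = 0.3
bulk = [true_fraction * b + (1 - true_fraction) * t for b, t in zip(b_cell, t_cell)]

# Matched reference: profiles measured under the same conditions as the bulk.
matched = estimate_fraction(bulk, b_cell, t_cell)

# Mismatched reference: a made-up gene-wise technical bias applied to both
# profiles, standing in for a reference produced in a different laboratory.
lab_bias = [1.5, 0.7, 1.2, 0.8]
b_other = [x * k for x, k in zip(b_cell, lab_bias)]
t_other = [x * k for x, k in zip(t_cell, lab_bias)]
mismatched = estimate_fraction(bulk, b_other, t_other)

print(f"true={true_fraction:.3f} matched={matched:.3f} mismatched={mismatched:.3f}")
```

With the matched reference the estimate recovers the simulated fraction, while the biased reference shifts it, which is the kind of effect that matching or correcting for technical factors is meant to avoid. Real deconvolution methods handle many cell types and genes, typically with constrained regression rather than this closed-form two-type case.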

Sources of Variation in Cell-Type RNA-Seq Profiles

10.21203/rs.2.23415/v1 ◽

2020 ◽

Cited By ~ 1

Author(s):

Johan Gustafsson ◽

Felix Held ◽

Jonathan Robinson ◽

Elias Björnson ◽

Rebecka Jörnsten ◽

...

Keyword(s):

Gene Expression ◽

Single Cell ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Specific Gene ◽

Rna Seq ◽

Cell Type ◽

Specific Gene Expression ◽

Cell Type Specific ◽

Technical Factors

Abstract Background: Cell-type specific gene expression profiles are needed for many computational methods operating on bulk RNA-Seq samples, such as deconvolution of cell-type fractions and digital cytometry. However, the gene expression profile of a cell type can vary substantially due to both technical factors and biological differences in cell state and surroundings, reducing the efficacy of such methods. Here, we investigated which factors contribute most to this variation. Results: We evaluated different normalization methods, quantified the magnitude of variation introduced by different sources, and examined the differences between UMI-based single-cell RNA-Seq and bulk RNA-Seq. We applied methods such as random forest regression to a collection of publicly available bulk and single-cell RNA-Seq datasets containing B and T cells, and found that the technical variation across laboratories is of the same magnitude as the biological variation across cell types. Tissue of origin and cell subtype are less important but still substantial factors, while the difference between individuals is relatively small. We also show that much of the difference between UMI-based single-cell and bulk RNA-Seq methods can be explained by the number of read duplicates per mRNA molecule in the single-cell sample. Conclusions: Our work shows the importance of either matching or correcting for technical factors when creating cell-type specific gene expression profiles that are to be used together with bulk samples.

A novel computational complete deconvolution method using RNA-seq data

10.1101/496596 ◽

2018 ◽

Cited By ~ 2

Author(s):

Kai Kang ◽

Qian Meng ◽

Igor Shats ◽

David M. Umbach ◽

Melissa Li ◽

...

Keyword(s):

Gene Expression ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Specific Gene ◽

Rna Seq ◽

Cell Type ◽

Specific Gene Expression ◽

Cell Type Composition ◽

Type Composition ◽

Cell Type Specific

Abstract The cell type composition of many biological tissues varies widely across samples. Such sample heterogeneity hampers efforts to probe the role of each cell type in the tissue microenvironment. Current approaches that address this issue have drawbacks. Cell sorting or single-cell based experimental techniques disrupt in situ interactions and alter the physiological status of cells in tissues. Computational methods are flexible and promising, but they often estimate either sample-specific proportions of each cell type or cell-type-specific gene expression profiles, not both, by requiring the other as input. We introduce a computational Complete Deconvolution method (CDSeq) that can estimate both sample-specific proportions of each cell type and cell-type-specific gene expression profiles simultaneously using bulk RNA-Seq data only. We assessed our method's performance using several synthetic and experimental mixtures of varied but known cell type composition and compared it with two state-of-the-art deconvolution methods on the same mixtures. The results showed that CDSeq can estimate both sample-specific proportions of each component cell type and cell-type-specific gene expression profiles with high accuracy. CDSeq holds promise for computationally deciphering complex mixtures of cell types, each with differing expression profiles, using RNA-seq data measured in bulk tissue (MATLAB code is available at https://github.com/kkang7/CDSeq_011).

SCDC: bulk gene expression deconvolution by multiple single-cell RNA sequencing references

Briefings in Bioinformatics ◽

10.1093/bib/bbz166 ◽

2020 ◽

Cited By ~ 13

Author(s):

Meichen Dong ◽

Aatish Thennavan ◽

Eugene Urrutia ◽

Yun Li ◽

Charles M Perou ◽

...

Keyword(s):

Gene Expression ◽

Single Cell ◽

Rna Sequencing ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Specific Gene ◽

Rna Seq ◽

Cell Type ◽

Mixed Cell ◽

Single Cell Rna Sequencing

Abstract Recent advances in single-cell RNA sequencing (scRNA-seq) enable characterization of transcriptomic profiles with single-cell resolution and circumvent averaging artifacts associated with traditional bulk RNA sequencing (RNA-seq) data. Here, we propose SCDC, a deconvolution method for bulk RNA-seq that leverages cell-type specific gene expression profiles from multiple scRNA-seq reference datasets. SCDC adopts an ENSEMBLE method to integrate deconvolution results from different scRNA-seq datasets that are produced in different laboratories and at different times, implicitly addressing the problem of batch-effect confounding. SCDC is benchmarked against existing methods using both in silico generated pseudo-bulk samples and experimentally mixed cell lines, whose known cell-type compositions serve as ground truths. We show that SCDC outperforms existing methods with improved accuracy of cell-type decomposition under both settings. To illustrate how the ENSEMBLE framework performs in complex tissues under different scenarios, we further apply our method to a human pancreatic islet dataset and a mouse mammary gland dataset. SCDC returns results that are more consistent with experimental designs and that reproduce more significant associations between cell-type proportions and measured phenotypes.

Digital sorting of complex tissues for cell type-specific gene expression profiles

BMC Bioinformatics ◽

10.1186/1471-2105-14-89 ◽

2013 ◽

Vol 14 (1) ◽

pp. 89 ◽

Cited By ~ 108

Author(s):

Yi Zhong ◽

Ying-Wooi Wan ◽

Kaifang Pang ◽

Lionel ML Chow ◽

Zhandong Liu

Keyword(s):

Gene Expression ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Specific Gene ◽

Cell Type ◽

Specific Gene Expression ◽

Cell Type Specific

SCDC: Bulk Gene Expression Deconvolution by Multiple Single-Cell RNA Sequencing References

10.1101/743591 ◽

2019 ◽

Cited By ~ 1

Author(s):

Meichen Dong ◽

Aatish Thennavan ◽

Eugene Urrutia ◽

Yun Li ◽

Charles M. Perou ◽

...

Keyword(s):

Gene Expression ◽

Single Cell ◽

Rna Sequencing ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Specific Gene ◽

Rna Seq ◽

Cell Type ◽

Mixed Cell ◽

Single Cell Rna Sequencing

Abstract Recent advances in single-cell RNA sequencing (scRNA-seq) enable characterization of transcriptomic profiles with single-cell resolution and circumvent averaging artifacts associated with traditional bulk RNA sequencing (RNA-seq) data. Here, we propose SCDC, a deconvolution method for bulk RNA-seq that leverages cell-type specific gene expression profiles from multiple scRNA-seq reference datasets. SCDC adopts an ENSEMBLE method to integrate deconvolution results from different scRNA-seq datasets that are produced in different laboratories and at different times, implicitly addressing the problem of batch-effect confounding. SCDC is benchmarked against existing methods using both in silico generated pseudo-bulk samples and experimentally mixed cell lines, whose known cell-type compositions serve as ground truths. We show that SCDC outperforms existing methods with improved accuracy of cell-type decomposition under both settings. To illustrate how the ENSEMBLE framework performs in complex tissues under different scenarios, we further apply our method to a human pancreatic islet dataset and a mouse mammary gland dataset. SCDC returns results that are more consistent with experimental designs and that reproduce more significant associations between cell-type proportions and measured phenotypes.

Methods to analyze cell type-specific gene expression profiles from heterogeneous cell populations

Animal Cells and Systems ◽

10.1080/19768354.2016.1191544 ◽

2016 ◽

Vol 20 (3) ◽

pp. 113-117 ◽

Cited By ~ 5

Author(s):

Jane Jung ◽

Hosung Jung

Keyword(s):

Gene Expression ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Cell Populations ◽

Specific Gene ◽

Cell Type ◽

Specific Gene Expression ◽

Cell Type Specific

Differential sensitivities of transcription factor target genes underlie cell type-specific gene expression profiles

Blood Cells Molecules and Diseases ◽

10.1016/j.bcmd.2006.10.068 ◽

2007 ◽

Vol 38 (2) ◽

pp. 147-148

Author(s):

Kirby D. Johnson ◽

Shin-Il Kim ◽

Megan E. Boyer ◽

Emery H. Bresnick

Keyword(s):

Gene Expression ◽

Transcription Factor ◽

Target Genes ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Specific Gene ◽

Cell Type ◽

Specific Gene Expression ◽

Transcription Factor Target ◽

Cell Type Specific

Cell-type specific gene expression profiles of leukocytes in human peripheral blood

BMC Genomics ◽

10.1186/1471-2164-7-115 ◽

2006 ◽

Vol 7 (1) ◽

Cited By ~ 198

Author(s):

Chana Palmer ◽

Maximilian Diehn ◽

Ash A Alizadeh ◽

Patrick O Brown

Keyword(s):

Gene Expression ◽

Peripheral Blood ◽

Human Peripheral Blood ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Specific Gene ◽

Cell Type ◽

Specific Gene Expression ◽

Cell Type Specific

Differential sensitivities of transcription factor target genes underlie cell type-specific gene expression profiles

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.0604041103 ◽

2006 ◽

Vol 103 (43) ◽

pp. 15939-15944 ◽

Cited By ~ 33

Author(s):

K. D. Johnson ◽

S.-I. Kim ◽

E. H. Bresnick

Keyword(s):

Gene Expression ◽

Transcription Factor ◽

Target Genes ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Specific Gene ◽

Cell Type ◽

Specific Gene Expression ◽

Transcription Factor Target ◽

Cell Type Specific

Deep autoencoder enables interpretable tissue-adaptive deconvolution and cell-type-specific gene analysis

10.1101/2021.10.26.465846 ◽

2021 ◽

Author(s):

Yanshuo Chen ◽

Yixuan Wang ◽

Yuelong Chen ◽

Yumeng Wei ◽

Yunxiang Li ◽

...

Keyword(s):

Gene Expression ◽

Single Cell ◽

Clinical Data ◽

Gene Set Enrichment Analysis ◽

Specific Gene ◽

Rna Seq ◽

Cell Type ◽

Specific Gene Expression ◽

Wide Range ◽

Cell Type Specific

Abstract Single-cell RNA-seq has become a powerful tool for researchers to study biologically significant characteristics at high resolution, but its application to emerging data is currently limited by its intrinsic techniques. Here, we introduce TAPE, a deep learning method that connects bulk RNA-seq and single-cell RNA-seq to balance the demands of big data and precision. By constructing an interpretable decoder and training under a unique scheme, TAPE can predict cell-type fractions and cell-type-specific gene expression tissue-adaptively. Compared with existing methods on several benchmarking datasets, TAPE is more accurate (up to 40% performance improvement on the real bulk data) and faster. It is sensitive enough to provide biologically meaningful predictions. For example, only TAPE can predict the increasing trend of the monocyte-to-lymphocyte ratio (MLR) in COVID-19 patients from mild to serious symptoms, and its estimated indices are consistent with laboratory data. More importantly, through the analysis of clinical data, TAPE shows its ability to predict cell-type-specific gene expression profiles with biological significance. Combined with single-sample gene set enrichment analysis (ssGSEA), TAPE also provides valuable clues for investigating the immune response in different virus-infected patients. We believe that TAPE will enable and accelerate the precise analysis of high-throughput clinical data in a wide range.
