Estimating Intraclonal Heterogeneity and Subpopulation Changes from Perturbational Bulk Gene Expression Profiles in LINCS L1000 CMap by Premnas
The connectivity among perturbation signatures curated in the CMap library is a valuable resource for understanding the therapeutic pathways and biological processes associated with drugs and diseases. However, because the L1000 assay profiles expression at the bulk level, intraclonal heterogeneity and changes in subpopulation composition that could contribute to perturbation responses are largely neglected, which hampers the interpretability and reproducibility of the connections. In this work, we propose a computational framework, Premnas, that estimates the abundance of undetermined subpopulations directly from L1000 profiles in CMap, using an ad hoc subpopulation representation learned by archetypal analysis from a well-normalized batch of single-cell RNA-seq datasets. By recovering information on subpopulation changes upon perturbation, we further explore and examine the potential of CMap L1000 for identifying drug cocktails and drug-resistant/susceptible subpopulations. The proposed framework offers a new perspective on the connectivity among cellular signatures and expands the scope of CMap and other similar perturbation datasets limited by bulk profiling technology. The executable and source code of Premnas are freely available at https://github.com/jhhung/Premnas.
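
The general idea of estimating subpopulation abundance from a bulk profile can be illustrated with a minimal sketch. This is not the Premnas implementation: it assumes a hypothetical signature matrix (genes by subpopulations) is already available, and it substitutes plain non-negative least squares for whatever deconvolution method the framework actually uses. The function name `estimate_fractions`, the variable names, and the synthetic toy data are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import nnls


def estimate_fractions(signature, bulk):
    """Estimate subpopulation fractions for one bulk profile.

    signature: (genes x subpopulations) matrix of reference expression
               (hypothetical; e.g. derived from single-cell data).
    bulk:      (genes,) bulk expression vector.
    Returns non-negative coefficients rescaled to sum to 1.
    """
    coef, _residual = nnls(signature, bulk)
    total = coef.sum()
    if total == 0:
        return coef  # degenerate case: no subpopulation explains the profile
    return coef / total


# Synthetic toy data (assumption): 50 genes, 3 subpopulations, no noise.
rng = np.random.default_rng(0)
signature = rng.gamma(shape=2.0, scale=1.0, size=(50, 3))

true_control = np.array([0.6, 0.3, 0.1])
true_treated = np.array([0.2, 0.3, 0.5])

# A bulk profile is modeled here as a mixture of subpopulation signatures.
bulk_control = signature @ true_control
bulk_treated = signature @ true_treated

est_control = estimate_fractions(signature, bulk_control)
est_treated = estimate_fractions(signature, bulk_treated)

# Change in composition between treated and control profiles.
shift = est_treated - est_control
```

In this toy setting, a positive entry in `shift` would suggest a subpopulation that expands under the perturbation and a negative entry one that shrinks; real L1000 profiles are noisy and cover a limited gene set, so estimates would be correspondingly less exact.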