Comprehensive transcriptomic analysis of cell lines as models of primary tumor samples across 22 tumor types
Abstract
Cancer cell lines are commonly used as models for cancer biology. While they are limited in their ability to capture complex interactions between tumors and their surrounding environment, they are a cornerstone of cancer research, and many important findings have been made using cell line models. Not all cell lines are appropriate models of primary tumors, however, which may contribute to the difficulty in translating in vitro findings to patients. Previous studies have leveraged public datasets to evaluate cell lines as models of primary tumors, but they have been limited in scope to specific tumor types and typically ignore the presence of tumor-infiltrating cells in the primary tumor samples. We present here a comprehensive pan-cancer analysis utilizing approximately 9,000 transcriptomic profiles from The Cancer Genome Atlas (TCGA) and the Cancer Cell Line Encyclopedia to evaluate cell lines as models of primary tumors across 22 different tumor types. After adjusting for tumor purity in the primary tumor samples, we performed correlation analysis and differential gene expression analysis between the primary tumor samples and cell lines. We found that cell-cycle pathways are consistently upregulated in cell lines, while no pathways are consistently upregulated across the primary tumor samples. In a case study, we compared colorectal cancer cell lines with primary tumor samples across the colorectal subtypes and identified three colorectal cell lines that were derived from fibroblasts rather than tumor epithelial cells. Lastly, we propose a new cell line panel, the TCGA-110, which contains the most representative cell lines from 22 different tumor types, as a more comprehensive and informative alternative to the NCI-60 panel.
Our analyses of the other tumor types are available in our web app (http://comphealth.ucsf.edu/TCGA110) as a resource to the cancer research community, and we hope it will allow researchers to select more appropriate cell line models and increase the translatability of in vitro findings.
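
To make the purity adjustment and correlation steps concrete, the following is a minimal sketch and not the pipeline used in the study. The abstract does not state how purity was adjusted or which correlation measure was used, so the sketch assumes a per-gene linear regression of expression on purity, extrapolated to a purity of 1, followed by a Spearman rank correlation. The function names (`adjust_for_purity`, `spearman`) and the toy numbers in the usage note are hypothetical.

```python
from statistics import mean


def adjust_for_purity(expr, purity):
    """Shift each tumor sample's expression to an estimated purity of 1.

    expr:   list of genes, each a list with one value per tumor sample.
    purity: estimated tumor purity (0..1) for each sample, same order.
    Assumption: expression varies linearly with purity for each gene.
    """
    p_mean = mean(purity)
    p_ss = sum((p - p_mean) ** 2 for p in purity)
    adjusted = []
    for gene in expr:
        g_mean = mean(gene)
        # Ordinary least-squares slope of expression on purity for this gene.
        slope = sum((p - p_mean) * (g - g_mean)
                    for p, g in zip(purity, gene)) / p_ss
        # Residual plus fitted value at purity = 1 simplifies to g + slope * (1 - p).
        adjusted.append([g + slope * (1.0 - p) for p, g in zip(purity, gene)])
    return adjusted


def ranks(values):
    """Return 1-based ranks, averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    x_mean, y_mean = mean(x), mean(y)
    cov = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    x_ss = sum((a - x_mean) ** 2 for a in x)
    y_ss = sum((b - y_mean) ** 2 for b in y)
    return cov / (x_ss * y_ss) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))
```

As a usage example, given a gene-by-sample matrix `tumors`, a matching list `purity`, and a cell line profile `cell_line` over the same genes, `adjusted = adjust_for_purity(tumors, purity)` followed by `spearman(cell_line, [row[0] for row in adjusted])` scores the cell line against the first purity-adjusted tumor sample. Ranking cell lines by such scores within a tumor type is one plausible way to pick representative models, again under the assumptions stated above.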