Abstract

One of the hallmarks of cancer is the existence of a high mutational load in driver genes, which is balanced by upregulation or downregulation of DNA repair pathways, since almost complete DNA repair is required for mitosis. The prediction of cancer survival from gene expression has been investigated by many groups; however, a comprehensive re-evaluation of the original data, adjusted for the PCNA metagene, indicates that only a small proportion of genes are truly predictive of survival. Little is known regarding the effect of the PCNA metagene on survival prediction specifically by DNA repair genes. We investigated prediction of overall survival (OS) in 18 cancers using normalized RNA-Seq data for 126 DNA repair genes with expression available in TCGA. Transformations for normality and adjustments for age at diagnosis, stage, and the PCNA metagene were performed for all DNA repair genes. We also analyzed genomic event rates (GERs) for somatic mutations, deletions, and amplifications in driver genes and DNA repair genes. After empirical p-value testing with randomly selected gene sets, OS could be predicted significantly by sets of DNA repair genes for 61% (11/18) of the cancers. Pathway activation analysis indicates that, in the presence of dysfunctional driver genes, the initial damage signaling and minor single-gene repair mechanisms may be abrogated, while later pathway genes remain fully activated and intact. Neither PARP1 nor PARP2 was a significant predictor of survival for any of the 11 cancers. Results from cluster analysis of GERs indicate that the most opportunistic set of cancers warranting further study are AML, colorectal, and renal papillary, because of their lower GERs for mutations, deletions, and amplifications in DNA repair genes.
However, the most opportunistic cancer to study is likely to be AML, since it showed the lowest GERs for mutations, deletions, and amplifications, suggesting that DNA repair pathway activation in AML is intact and genomically unaltered. In conclusion, our hypothesis-driven focus on DNA repair gene expression adjusted for the PCNA metagene as a means of predicting OS in various cancers yielded statistically significant sets of genes.

Author summary

The proliferating cell nuclear antigen (PCNA) protein is a homotrimer and activator of polymerase δ, which encircles DNA during transcription to recruit other proteins involved in replication and repair. In tumor cells, expression of PCNA is highly upregulated; however, PCNA-related activity is a normal process for DNA transcription in eukaryotes and is therefore not considered to play a central role in the selective genetic pressure associated with tumor development. Since PCNA is widely co-regulated with other genes in normal tissues, we developed a workflow involving several functional transforms and regression models to “remove” the co-regulatory effect of PCNA on expression of DNA repair genes, and predicted overall cancer survival using DNA repair gene expression with and without removal of the PCNA effect. Other adjustments to survival prediction were employed, such as subject age at diagnosis and tumor stage. Random selection of gene sets was also employed for empirical p-value testing to determine the strength of the PCNA effect on DNA repair and overall survival adjustments. Since TCGA RNA-Seq data were used, we also characterized the frequency of deletions, amplifications, and somatic mutations in the DNA repair genes considered, in order to observe which genomic events are the most frequent for the cancers evaluated.
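
The two computational ideas named in this summary, removing a covariate's effect by regression and empirical p-value testing against randomly selected gene sets, can be illustrated with a minimal Python sketch. It is not the workflow used in the study: the specific transforms and survival models are not described here, so the sketch assumes a plain linear regression on a single PCNA metagene score and an add-one empirical p-value. The function names (`residualize`, `random_set_null`, `empirical_p_value`) and the caller-supplied `score_fn` are hypothetical.

```python
import numpy as np


def residualize(expr, covariate):
    """Remove the linear effect of `covariate` (e.g. a PCNA metagene score)
    from each column of `expr` (samples x genes) by least squares.

    Assumption: a simple linear model with intercept; the study's actual
    transforms and regression models may differ.
    """
    expr = np.asarray(expr, dtype=float)
    covariate = np.asarray(covariate, dtype=float)
    design = np.column_stack([np.ones_like(covariate), covariate])
    coef, *_ = np.linalg.lstsq(design, expr, rcond=None)
    return expr - design @ coef


def random_set_null(score_fn, n_genes_total, set_size, n_draws, seed=0):
    """Score `n_draws` randomly selected gene sets of size `set_size`.

    `score_fn` is a hypothetical caller-supplied function that takes an
    array of gene indices and returns a survival-prediction statistic.
    """
    rng = np.random.default_rng(seed)
    return np.array([
        score_fn(rng.choice(n_genes_total, size=set_size, replace=False))
        for _ in range(n_draws)
    ])


def empirical_p_value(observed, null_scores):
    """Fraction of random-set scores at least as large as the observed
    score, with an add-one correction so the p-value is never zero."""
    null_scores = np.asarray(null_scores, dtype=float)
    return (1 + np.sum(null_scores >= observed)) / (1 + null_scores.size)
```

In use, `residualize` would be applied to the expression matrix before scoring, and the score of the DNA repair gene set would be compared with the scores returned by `random_set_null` for random sets of the same size.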