Identification of key genes associated with lung adenocarcinoma by bioinformatics analysis
Lung adenocarcinoma (LUAD) is the most common histological type of lung cancer, accounting for around 40% of all lung cancers. The pathogenesis of LUAD has not yet been fully elucidated. In the current study, we comprehensively analyzed the dysregulated genes in LUAD by mining public datasets. Two gene expression datasets were obtained from the Gene Expression Omnibus (GEO) database. Dysregulated genes were identified with the GEO2R online tool and analyzed with R packages, Cytoscape software, and the STRING and GEPIA online tools. A total of 275 dysregulated genes were common to the two independent datasets, comprising 54 up-regulated and 221 down-regulated genes in LUAD. Gene Ontology (GO) enrichment analysis showed that these dysregulated genes were significantly enriched in 258 biological processes (BPs), 27 cellular components (CCs), and 21 molecular functions (MFs). Furthermore, protein-protein interaction (PPI) network analysis identified PECAM1, ENG, KLF4, CDH5, and VWF as key genes. Survival analysis indicated that low expression of ENG was associated with poor overall survival (OS) in LUAD patients, and that low expression of PECAM1 was associated with both poor OS and poor recurrence-free survival. A Cox regression model based on age, tumor stage, ENG, and PECAM1 effectively predicted the 5-year survival of LUAD patients. This study revealed key genes, BPs, CCs, and MFs involved in LUAD, which may provide new insights into its pathogenesis. In addition, ENG and PECAM1 might serve as promising prognostic markers in LUAD.
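The step of finding dysregulated genes common to two independent datasets can be illustrated with a minimal Python sketch. It is not the pipeline used in the study, which relied on GEO2R. The cutoffs (`LOG2FC_CUTOFF`, `ADJ_P_CUTOFF`), the function names, the placeholder gene names, and all numeric values are hypothetical and chosen only for illustration; the abstract does not state the thresholds applied.

```python
from typing import Dict, Set, Tuple

# Hypothetical cutoffs: the abstract does not state which thresholds were used.
LOG2FC_CUTOFF = 1.0
ADJ_P_CUTOFF = 0.05

# Maps gene symbol -> (log2 fold change, tumor vs. normal; adjusted p-value).
DegTable = Dict[str, Tuple[float, float]]


def split_dysregulated(table: DegTable) -> Tuple[Set[str], Set[str]]:
    """Return (up-regulated, down-regulated) gene sets for one dataset."""
    up = {g for g, (lfc, p) in table.items()
          if p < ADJ_P_CUTOFF and lfc >= LOG2FC_CUTOFF}
    down = {g for g, (lfc, p) in table.items()
            if p < ADJ_P_CUTOFF and lfc <= -LOG2FC_CUTOFF}
    return up, down


def common_dysregulated(a: DegTable, b: DegTable) -> Tuple[Set[str], Set[str]]:
    """Keep only genes dysregulated in the same direction in both datasets."""
    up_a, down_a = split_dysregulated(a)
    up_b, down_b = split_dysregulated(b)
    return up_a & up_b, down_a & down_b


# Toy tables with placeholder gene names and made-up values (not study data).
dataset_1: DegTable = {
    "GENE_A": (2.1, 0.001),
    "GENE_B": (-1.8, 0.003),
    "GENE_C": (-2.5, 0.0004),
    "GENE_D": (0.3, 0.6),
}
dataset_2: DegTable = {
    "GENE_A": (1.6, 0.01),
    "GENE_B": (-1.2, 0.02),
    "GENE_C": (1.4, 0.03),    # opposite direction, so excluded
    "GENE_D": (-1.9, 0.001),  # not significant in dataset_1, so excluded
}

common_up, common_down = common_dysregulated(dataset_1, dataset_2)
print("common up-regulated:", sorted(common_up))
print("common down-regulated:", sorted(common_down))
```

Requiring the same direction of change in both datasets, rather than intersecting all significant genes, excludes genes whose dysregulation is inconsistent across cohorts.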