Chao Huang, Ce-Gui Hu, Zhi-Kun Ning, Jun Huang, Zheng-Ming Zhu, Department of Gastrointestinal Surgery, The Second Affiliated Hospital of Nanchang University, Nanchang 330006, Jiangxi Province, China
Abstract
BACKGROUND Self-renewal of gastric cancer stem cells (GCSCs) is considered to be the underlying cause of the metastasis, drug resistance, and recurrence of gastric cancer (GC).
AIM To characterize the expression of stem cell-related genes in GC.
METHODS RNA sequencing results and clinical data for gastric adenoma and adenocarcinoma samples were obtained from The Cancer Genome Atlas database, and the results of the GC mRNA expression-based stemness index (mRNAsi) were analyzed. Weighted gene coexpression network analysis was then used to find modules of interest and their key genes. Survival analysis of key genes was performed using the online tool Kaplan-Meier Plotter, and the online database Oncomine was used to assess the expression of key genes in GC.
RESULTS mRNAsi was significantly upregulated in GC tissues compared to normal gastric tissues (P < 0.0001). A total of 16 modules were obtained from the gene coexpression network; the brown module was most positively correlated with mRNAsi. Sixteen key genes (BUB1, BUB1B, NCAPH, KIF14, RACGAP1, RAD54L, TPX2, KIF15, KIF18B, CENPF, TTK, KIF4A, SGOL2, PLK4, XRCC2, and C1orf112) were identified in the brown module. The functional and pathway enrichment analyses showed that the key genes were significantly enriched in the spindle cellular component, the sister chromatid segregation biological process, the motor activity molecular function, and the cell cycle and homologous recombination pathways. Survival analysis and Oncomine analysis revealed that the prognosis of patients with GC and the expression of three genes (RAD54L, TPX2, and XRCC2) were consistently related.
CONCLUSION Sixteen key genes are primarily associated with stem cell self-renewal and cell proliferation characteristics. RAD54L, TPX2, and XRCC2 are the most likely therapeutic targets for inhibiting the stemness characteristics of GC cells.
Key Words:Gastric cancer;Cancer stem cell;Key gene;The Cancer Genome Atlas database;Weighted gene coexpression network analysis;mRNA expression-based stemness index
Gastric cancer (GC) is a common malignant tumor, and its mortality rate ranks third among cancers globally[1]. In China, GC is also a major cause of cancer-related death, and patients with advanced disease have lower survival rates and higher recurrence rates[2]. Cancer stem cells (CSCs), defined functionally rather than by cellular origin, have greater tumor initiation, growth, and metastatic potential than other tumor cells[3]. CSCs may be derived from normal stem cells that have acquired malignant mutations and have lost the ability to self-regulate cell proliferation[4]. CSCs have a strong capacity for self-renewal and expand through symmetric division, which excessively increases cell growth and ultimately leads to tumor formation[5,6]. CSCs can lead to cancer recurrence, metastasis, multidrug resistance, and radiation resistance by blocking G0 phase, resulting in new tumors[7]. CSCs can effectively protect cancer cells from apoptosis by activating DNA repair capabilities[8]. The CSC model proposes that tumor growth is driven by a small number of self-sustaining cells that have longevity, infinite proliferation, and an ability to differentiate into the entire heterogeneous population of the tumor[9].
Self-renewal of GC stem cells (GCSCs) is considered to be the underlying cause of the metastasis, drug resistance, and recurrence of GC[10,11]. In vitro studies with cultured GCSCs have found that these cells are more resistant to chemotherapy and radiotherapy[12,13], which may be due to the high expression of anti-apoptotic proteins, improved DNA repair efficiency, and alterations in cell cycle kinetics[9,14]. The identification and targeted therapy of cancer stem cells are of great significance in the treatment of GC[15]. The mRNA expression-based stemness index (mRNAsi), which can be used as a quantitative representation of cancer cell stemness, is an indicator describing the degree of similarity between tumor cells and stem cells. Moreover, inhibiting DNA methyltransferase can reduce the accumulation of tumorigenic ability of GCSCs[16]. Transcriptomes and methylomes were analyzed on multiple platforms to quantify stemness, yielding the DNA methylation-based stemness index (mDNAsi) and mRNAsi; this analysis was also applied to The Cancer Genome Atlas (TCGA) dataset to calculate stemness scores for those samples[17]. Weighted gene coexpression network analysis (WGCNA) determines the correlation between genes based on systematic biological methods; it is used not only to build gene networks and detect modules but also to identify key genes[18,19]. WGCNA has been widely used to screen key genes related to the clinical characteristics of many tumor types, such as pancreatic cancer and kidney cancer[20,21]. However, the genes that regulate the maintenance and proliferation of GCSCs remain unknown. This study aimed to identify key genes related to stemness by combining WGCNA with GC mRNAsi in TCGA, thus providing new ideas for the treatment of GC.
RNA sequencing results of 373 tissues and 348 human gastric adenomas and adenocarcinoma samples were obtained from TCGA database (https://portal.gdc.cancer.gov).The RNA-seq results of 30 normal samples and 343 cancer samples were merged into a matrix file using a script in the Perl language (http://www.perl.org/).We then converted the Ensembl ID in the matrix file to the gene name using the Ensembl database (http://www.ensembl.org/index.html) and the Perl language script.In addition,406 pieces of clinical data were downloaded,and the relevant clinical data were collated and extracted using scripts in the Perl language.The calculation of mRNAsi was performed from the molecular spectrum of normal cells with different degrees of stemness[17].
The statistical software R (version 3.6.0, https://www.r-project.org/), the limma package, and the Wilcoxon test were used to screen for differentially expressed genes (DEGs) between normal gastric tissue and GC samples. The values of genes with the same name were averaged, and genes with expression levels <0.2 were removed. A false discovery rate (FDR) <0.05 and |log2 fold change| >1 were used as screening criteria.
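The screening criteria above can be illustrated with a minimal Python sketch. It is not the authors' R pipeline: the Benjamini-Hochberg procedure is assumed as the FDR method, the fold change is assumed to be the ratio of group means, and the function names and input layout are hypothetical.

```python
import math

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def screen_degs(genes, p_values, mean_normal, mean_tumor,
                fdr_cut=0.05, lfc_cut=1.0):
    """Keep genes with FDR < fdr_cut and |log2 fold change| > lfc_cut.

    Assumption: fold change is tumor mean / normal mean, both positive.
    """
    fdr = benjamini_hochberg(p_values)
    hits = []
    for gene, q, normal, tumor in zip(genes, fdr, mean_normal, mean_tumor):
        log2fc = math.log2(tumor / normal)
        if q < fdr_cut and abs(log2fc) > lfc_cut:
            direction = "up" if log2fc > 0 else "down"
            hits.append((gene, round(log2fc, 3), direction))
    return hits
```

In this sketch the per-gene P values would come from a two-group Wilcoxon rank-sum test computed elsewhere; only the multiple-testing correction and the thresholding are shown.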
The WGCNA R package was used to perform the WGCNA. Normal samples were removed first, and the data were then checked for missing values; entries with missing values were removed. First, the similarity matrix was constructed by calculating the correlation between all pairs of genes. Second, the WGCNA package was used to select an appropriate soft-thresholding power β to improve coexpression similarity and achieve a scale-free topology. Third, the adjacency of genes was transformed into a topological overlap matrix (TOM), and the corresponding dissimilarity was calculated. Average linkage hierarchical clustering was conducted with TOM-based dissimilarity measurements, and the minimum size of the gene dendrogram was 30. Finally, the dissimilarity of the genes was calculated and module dendrograms were built.
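The adjacency and TOM steps can be sketched in a few lines of Python. This is an illustration rather than the WGCNA package itself: it assumes an unsigned network, in which the adjacency is the absolute Pearson correlation raised to the power β, and it uses the standard topological overlap formula. The function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def adjacency(expr, beta):
    """Unsigned adjacency a_ij = |cor(i, j)| ** beta.

    expr is a list of per-gene expression vectors (one value per sample).
    """
    g = len(expr)
    return [[1.0 if i == j else abs(pearson(expr[i], expr[j])) ** beta
             for j in range(g)] for i in range(g)]

def tom_dissimilarity(adj):
    """Return 1 - TOM for a symmetric adjacency matrix.

    TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij),
    where u runs over genes other than i and j, and k_i is the
    connectivity of gene i (sum of its off-diagonal adjacencies).
    """
    g = len(adj)
    k = [sum(adj[i][u] for u in range(g) if u != i) for i in range(g)]
    diss = [[0.0] * g for _ in range(g)]
    for i in range(g):
        for j in range(g):
            if i == j:
                continue
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(g) if u not in (i, j))
            tom = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            diss[i][j] = 1.0 - tom
    return diss
```

The resulting dissimilarity matrix is what average linkage hierarchical clustering would take as input; raising β shrinks weak correlations toward zero faster than strong ones, which is how the soft threshold pushes the network toward a scale-free topology.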
Gene significance (GS) refers to the correlation between a gene and a sample trait. Statistical significance was determined using the associated P values. To reduce the number of modules, we set a threshold (<0.25) to merge modules that were highly similar. The gene modules were then analyzed in combination with the clinical phenotype. Module membership (MM) refers to the correlation between the expression profile of a gene and the module eigengene. After selecting the module of interest, GS and MM were calculated for each gene in it. We defined cor.gene MM >0.8 and cor.gene GS >0.5 as the thresholds for the key genes of the module.
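As an illustration of the key-gene thresholds, the following Python sketch computes MM and GS as Pearson correlations and applies the two cut-offs. It assumes the module eigengene and the trait (here mRNAsi) are already available as per-sample vectors; the function names and the dictionary layout are hypothetical, not part of the WGCNA package.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def select_key_genes(expr_by_gene, eigengene, trait, mm_cut=0.8, gs_cut=0.5):
    """Return {gene: (MM, GS)} for genes passing both thresholds.

    MM: correlation of the gene's expression with the module eigengene.
    GS: correlation of the gene's expression with the sample trait.
    """
    key = {}
    for gene, values in expr_by_gene.items():
        mm = pearson(values, eigengene)
        gs = pearson(values, trait)
        if mm > mm_cut and gs > gs_cut:
            key[gene] = (mm, gs)
    return key
```

Genes passing both filters sit in the top right corner of an MM versus GS scatter plot, which is where the key genes are read off.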
We used the ggpubr package of the R language to draw a box plot of differential gene expression to understand the differential expression of key genes.The pheatmap package was used to create heat maps of key genes.Correlation analysis of key genes was performed using the corrplot package.
To explore the interaction between key genes,we imported the key genes into the Search Tool for the Retrieval of Interacting Genes (STRING,http://string.embl.de/)database[22].The protein-protein interaction (PPI) network of key genes,which included genes with a combined score of >0.4,was constructed using STRING.
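The network construction step amounts to keeping interactions above the combined-score cut-off and counting connections per gene. A minimal Python sketch follows, assuming the interactions have already been exported as (gene, gene, score) tuples with scores on a 0 to 1 scale; the function name and input layout are hypothetical and this is not the STRING API.

```python
from collections import Counter

def build_network(scored_edges, min_score=0.4):
    """Filter edges by combined score and count connections per gene.

    scored_edges: iterable of (gene_a, gene_b, combined_score) tuples.
    Returns (kept_edges, degree) where degree maps gene -> edge count.
    """
    kept = [(a, b) for a, b, score in scored_edges if score > min_score]
    degree = Counter()
    for a, b in kept:
        degree[a] += 1
        degree[b] += 1
    return kept, degree
```

Sorting `degree.most_common()` then ranks genes by their number of connection nodes.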
To study the biological functions of the module genes and key genes,we used the clusterProfiler package for gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses[23].P<0.05 and FDR <0.05 were used as thresholds for identifying significant GO terms and pathways.A histogram of GO enrichment results for the key genes was plotted using the ggplot2 package.GO includes cellular component (CC),biological process (BP),and molecular function(MF).
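Over-representation analyses of this kind are commonly based on the hypergeometric distribution. The Python sketch below shows that test in isolation; it illustrates the statistic under that assumption and does not reproduce the clusterProfiler implementation. The function and parameter names are hypothetical.

```python
from math import comb

def enrichment_p(overlap, list_size, term_size, background_size):
    """Upper-tail hypergeometric P value for term over-representation.

    overlap:         genes in the list that are annotated to the term
    list_size:       number of genes in the list being tested
    term_size:       number of background genes annotated to the term
    background_size: total number of background genes
    Returns P(X >= overlap) when list_size genes are drawn at random.
    """
    total = comb(background_size, list_size)
    upper = min(list_size, term_size)
    tail = sum(
        comb(term_size, x) * comb(background_size - term_size, list_size - x)
        for x in range(overlap, upper + 1)
    )
    return tail / total
```

The P values from all tested terms would then be adjusted for multiple testing before the P <0.05 and FDR <0.05 thresholds are applied.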
To understand the relationship between the expression of key genes and prognosis, we used the online tool Kaplan–Meier Plotter (http://kmplot.com/) for survival analysis. A log-rank P value <0.05 was considered statistically significant. In addition, we used the online database Oncomine (http://www.oncomine.com) to assess the expression of key genes in GC.
The Wilcoxon test showed a significant difference between the mRNAsi of normal gastric tissue and that of gastric tumor tissue (P <0.0001, Figure 1A). A total of 6685 DEGs were identified by Wilcoxon test analysis, among which 5479 were upregulated and 1206 were downregulated. The 20 most significantly upregulated genes and the 20 most significantly downregulated genes were mapped using the pheatmap package (Figure 1B).
To identify biologically important gene modules and better understand genes associated with gastric cancer stemness, we used the WGCNA package to construct a gene coexpression network. Ultimately, we obtained 16 modules (Figure 2A); the brown module was most positively correlated with mRNAsi, with a correlation coefficient of 0.75. In addition, the turquoise module exhibited a high negative correlation with mRNAsi, with a correlation coefficient of -0.77 (Figure 2B). Because the brown module showed the strongest positive correlation with mRNAsi, we selected it for further analysis.
According to the threshold of key gene screening, we screened 16 key genes in the mRNAsi group, namely, BUB1, BUB1B, NCAPH, KIF14, RACGAP1, RAD54L, TPX2, KIF15, KIF18B, CENPF, TTK, KIF4A, SGOL2, PLK4, XRCC2, and C1orf112 (Figure 2C).
The expression of key genes differed significantly between normal gastric tissue and tumor tissue. The expression of key genes in gastric tumors was significantly higher than that in normal gastric tissues (P <0.001 for all, Figure 3A). There was a clear correlation between the expression levels of key genes, and CENPF had the highest correlation with KIF14 (Figure 3B).
The PPI network of key genes consisted of 16 nodes and 94 edges, including 16 upregulated genes (Figure 3C). In the PPI network, TTK, TPX2, NCAPH, KIF15, CENPF, and BUB1 all had 14 connection nodes (Figure 3D).
GO and KEGG pathway enrichment analyses were performed using the clusterProfiler package. The most significantly enriched GO terms were spindle (ontology: CC), sister chromatid segregation (ontology: BP), and motor activity (ontology: MF) (Figure 4). The significantly enriched KEGG pathways were cell cycle and homologous recombination (Table 1).
Survival analysis of key genes was performed using Kaplan-Meier curves, which showed that all key genes except KIF14 and KIF18B were related to prognosis. GC patients with high expression of RAD54L, TPX2, and XRCC2 had shorter survival times, while GC patients with high expression of other key genes had longer survival times (Figure 5). However, differential analysis showed that all 16 key genes were highly expressed in GC tissues. We further analyzed the 16 key genes using the online database Oncomine, which also showed that they were highly expressed in GC tissues (Figure 6).
Table1 Significantly enriched Kyoto Encyclopedia of Genes and Genomes pathways
Figure1 Differential analysis of mRNA expression-based stemness index and screening of differentially expressed genes.A:Expression of mRNA expression-based stemness index in gastric tumor tissues and normal gastric tissues;B:Significantly differentially expressed genes in gastric tumor tissues and normal gastric tissues.
Figure2 Weighted gene coexpression network analysis.A:Clustering dendrograms of differentially expressed genes.Each piece of the leaves on the cluster dendrogram corresponds to a gene.The first color band indicates the modules detected by dynamic tree cut methods.The second color band indicates the modules after merging similar modules;B:Correlation between the gene module and mRNA expression-based stemness index (mRNAsi).The correlation coefficient in each cell represents the correlation between the gene module and mRNAsi,which increases in size from blue to red.The corresponding P value is also annotated;C:Scatter plot of module eigengenes in brown.The top right corner is where the key genes are located.
Metastasis, drug resistance, and recurrence are the main causes of the low survival rate of patients with GC. Self-renewal of GCSCs is considered to be the underlying cause of the metastasis, drug resistance, and recurrence of GC[10,11]. After conventional surgical treatment or adjuvant chemotherapy, the number of GCSCs does not decrease but stemness is enriched, leading to metastasis, drug resistance, and recurrence of GC[10]. Moreover, Bekaii-Saab et al[15] indicated that identification and targeted therapy of cancer stem cells had important significance in the treatment of GC. Therefore, targeted therapy of GCSCs is crucial. This study used WGCNA based on the mRNAsi score to identify key genes related to GCSC characteristics. We found that the mRNAsi of gastric tumor tissue was significantly higher than that of normal gastric tissue. In addition, we used the WGCNA package to construct a gene coexpression network. We obtained a total of 16 modules; the brown module was most positively correlated with mRNAsi, and the turquoise module showed a high negative correlation with mRNAsi. Therefore, we selected the brown module as the module of most interest and identified 16 key genes that are associated with the characteristics of GCSCs: BUB1, BUB1B, NCAPH, KIF14, RACGAP1, RAD54L, TPX2, KIF15, KIF18B, CENPF, TTK, KIF4A, SGOL2, PLK4, XRCC2, and C1orf112. These genes were upregulated in GC and their expression levels were correlated with one another. The most significantly enriched GO terms of these key genes were the spindle cellular component, the sister chromatid segregation biological process, and the motor activity molecular function. The significantly enriched KEGG pathways were cell cycle and homologous recombination. Except for KIF14 and KIF18B, the expression of all key genes was related to the prognosis of patients with GC.
Figure3 Analysis of key genes.A:Expression of the key genes in gastric tumor tissues and normal gastric tissues.The red histogram indicates gastric tumor tissue.The blue histogram indicates normal gastric tissue.cP <0.001;B:Correlation of key genes.The number in the box represents the size of the correlation,and the larger the number,the higher the correlation;C:Protein-protein interaction networks of key genes.The color of the lines represents the source of the evidence;D:Connection nodes of key genes.The number represents the number of connection nodes.
BUB1 encodes a protein that is essential for the function of the mitotic spindle checkpoint[24]. Grabsch et al[25] showed that BUB1 was overexpressed in up to 80% of GCs and was associated with tumor cell proliferation. Meanwhile, related studies indicated that high expression of BUB1B was related to the invasion, lymph node metastasis, liver metastasis, and recurrence of GC[26]. Kinesins are a family of molecular motors that play important roles in intracellular transport and cell division[27]. Previous studies demonstrated that KIF14 can enhance tumor adhesion, invasion, and chemoresistance during tumor development[28]. Tong et al[29] also revealed that KIF14 was overexpressed in GC and that this was related to tumor progression, invasion, and metastasis. Spheroid colony formation is an effective model for the characterization of cancer stem cells[11]. Oue et al[30] showed that KIF15 and KIF4A were upregulated in GC spheroid cells. A previous study also demonstrated that patients with high expression of KIF14 mRNA had a higher risk of malignancy, recurrence, and metastasis than those without high expression of KIF14 mRNA[31]. Overexpression of KIF18B increases the proliferation of hepatocellular carcinoma cells, and downregulation of KIF18B in vitro can inhibit the migration and invasion of cervical cancer cells[32]. KIF4A is considered to be an oncogene, indicating its involvement in malignancy, and its expression was indeed related to the occurrence, development, and metastasis of GC[33]. Rac GTPase-activating protein 1 (RacGAP1), which belongs to the GTPase-activating protein family, not only induces cytokinesis but also interferes with mitotic spindle assembly, thus participating in the regulation of cell proliferation[34]. Studies have indicated that RacGAP1 is involved in cell transformation, movement, migration, and metastasis[35] and is positively correlated with the proliferation marker Ki67[36]. In addition, RacGAP1 is highly expressed in GC and significantly associated with tumor progression and poor prognosis[37]. CENPF is a member of the centromere protein family and regulates the proliferation of various tumor cells[38]. Chen et al[39] showed that CENPF was overexpressed in GC and promoted its proliferation and metastasis. TTK is also known as a mitotic kinase[40], which is highly expressed in many types of tumors and promotes the proliferation of cancer cells[41,42]. Park et al[43] found that TTK plays an indispensable role in the development and maintenance of tumor stem cells. PLK4 is a major regulator of centriole duplication, and its overexpression in somatic cells can disrupt spindle formation, while its depletion can inhibit cell proliferation[44]. Shinmura et al[45] revealed that PLK4 is overexpressed in GC and plays an important role in cell proliferation, tumorigenesis, invasion, and drug resistance. C1ORF112 is overexpressed in various tumors, including GC, and plays an important role in the growth of cancer cells, but it is still unclear whether C1ORF112 is particularly necessary for the proliferation of cancer cells. Condensins are multiprotein complexes that play a central role in chromosome assembly and segregation during mitosis and meiosis[46]. NCAPH is one of the three non-SMC subunits of condensin I. Studies have indicated that non-SMC condensin I complex subunit G is overexpressed in colon cancer and hepatocellular carcinoma and promotes the proliferation and migration of tumor cells[47].
In addition, survival analysis and Oncomine analysis revealed that the prognosis of patients with GC and the expression of three genes (RAD54L, TPX2, and XRCC2) were consistently related. XRCC2 plays a crucial role in DNA repair and chromosome arrangement, and its dysfunction may lead to tumor development and progression[48,49]. Wang et al[50] found that XRCC2 is overexpressed in GC. RAD54L and XRCC2 are homologous recombination factors that are overexpressed in lung cancer and GC, and there is a significant correlation between the expression levels of these two proteins[51]. The inactivation of these factors leads to an obvious deficiency in homologous recombination and DNA repair in all eukaryotes[52]. In addition, related studies indicated that abnormal function of homologous recombination leads to genomic instability, resulting in the occurrence of cancer[53,54]. In this study, KEGG enrichment analysis indicated that RAD54L and XRCC2 were significantly enriched in homologous recombination. TPX2 is a microtubule-associated protein that activates the cell cycle kinase Aurora-A and thereby plays an important role in the formation of spindles in mitosis[55]. Liang et al[56] showed that TPX2 was overexpressed in GC and was related to progression and poor prognosis of GC.
Figure 4 Significantly enriched gene ontology biological process terms. The larger the circle, the greater the number of enriched genes; the redder the color, the more significant the difference. BP: Biological process; CC: Cellular component; MF: Molecular function.
In conclusion, this study found that the 16 key genes identified play an important role in the maintenance of GCSCs. In addition, RAD54L, TPX2, and XRCC2 are most likely to be therapeutic targets for inhibiting the stemness characteristics of GC, providing a new idea for the treatment of GC. However, further research is needed to verify this hypothesis.
Figure5 Survival curves of key genes using the online tool Kaplan-Meier Plotter.The red indicates high expression and black indicates low expression.A log-rank P <0.05 was considered statistically significant.
Figure6 Expression of key genes in gastric cancer based on the Oncomine online database.The key genes were highly expressed in gastric cancer tissues.
Gastric cancer (GC) stem cells are the primary cause of GC metastasis and drug resistance.
The purpose of this study was to characterize the expression of stem cell-related genes in GC.
RNA sequencing results and clinical data for gastric adenoma and adenocarcinoma samples were obtained from The Cancer Genome Atlas (TCGA) database, and the results of the GC mRNA expression-based stemness index (mRNAsi) were analyzed. Weighted gene coexpression network analysis (WGCNA) was then used to find modules of interest and their key genes. Survival analysis of key genes was performed using the online tool Kaplan-Meier Plotter, and the online database Oncomine was used to assess the expression of key genes in GC.
mRNAsi was significantly upregulated in GC tissues compared to normal gastric tissues (P <0.0001). A total of 16 modules were obtained from the gene coexpression network; the brown module was most positively correlated with mRNAsi. Sixteen key genes (BUB1, BUB1B, NCAPH, KIF14, RACGAP1, RAD54L, TPX2, KIF15, KIF18B, CENPF, TTK, KIF4A, SGOL2, PLK4, XRCC2, and C1orf112) were identified in the brown module.
RAD54L, TPX2, and XRCC2 are the most positively correlated with mRNAsi and are the most likely therapeutic targets for inhibiting the stemness characteristics of gastric cancer cells.
This study aimed to identify key genes related to stemness by combining WGCNA with GC mRNAsi in TCGA,thus providing new ideas for the treatment of GC.
World Journal of Gastrointestinal Surgery, 2020, Issue 11