Background: Hepatocellular carcinoma (HCC) has high incidence and mortality worldwide and seriously threatens physical and mental health. Coagulation is closely related to the occurrence and development of HCC. Whether coagulation-related genes (CRGs) can serve as prognostic markers for HCC remains to be investigated.
Methods: First, we identified differentially expressed coagulation-related genes between HCC and control samples using the GSE54236, GSE102079, and TCGA-LIHC datasets and the Genecards database. Then, univariate Cox regression analysis, LASSO regression analysis, and multivariate Cox regression analysis were used to determine the key CRGs and establish the coagulation-related risk score (CRRS) prognostic model in the TCGA-LIHC dataset. The predictive capability of the CRRS model was evaluated by Kaplan–Meier survival analysis and ROC analysis. External validation was performed in the ICGC-LIRI-JP dataset. In addition, a nomogram combining risk score, age, gender, grade, and stage was constructed to quantify the survival probability. We further analyzed the correlation between risk score and functional enrichment, pathways, and the tumor immune microenvironment.
Results: We identified 5 key CRGs (FLVCR1, CENPE, LCAT, CYP2C9, and NQO1) and constructed the CRRS prognostic model. The overall survival (OS) of the high-risk group was shorter than that of the low-risk group. The AUC values for 1-, 3-, and 5-year OS in the TCGA dataset were 0.769, 0.691, and 0.674, respectively. Cox analysis showed that CRRS was an independent prognostic factor for HCC. A nomogram established with risk score, age, gender, grade, and stage showed good prognostic value for HCC patients. In the high-risk group, resting memory CD4+ T cells, activated NK cells, and naive B cells were significantly lower. The expression levels of immune checkpoint genes were generally higher in the high-risk group than in the low-risk group.
Conclusions: The CRRS model has reliable predictive value for the prognosis of HCC patients.
Keywords: Hepatocellular carcinoma; Coagulation-related gene; Prognosis; Risk score
Wan-Xia Yang and Hong-Wei Gao have contributed equally to this work and share first authorship
Global cancer statistics for 2020 indicate that there were approximately 906,000 new cases and 830,000 deaths from primary liver cancer worldwide.
Tumor patients usually show a hypercoagulable state.
The tumor microenvironment (TME) is closely related to the generation, survival, and metastasis of HCC.
At present, few studies have explored the prognosis and immune status of HCC from the perspective of CRGs. In this study, we screened CRGs for HCC using multiple datasets and the Genecards database. A coagulation-related risk score (CRRS) prognostic model for HCC was constructed based on CRGs in the TCGA dataset, and its predictive ability was validated in the ICGC dataset. We then analyzed the correlation between risk score and clinicopathological indicators. Next, a nomogram combining the risk score with other prognostic clinical indicators was constructed to quantify the survival probability. Finally, the relationship between risk score and functional enrichment, as well as the TME, was studied. This study provides a molecular basis for understanding the complex mechanism of HCC and potential targets for its treatment.
The expression profile data and corresponding platform annotation information of microarray datasets GSE54236 and GSE102079 were downloaded from the GEO database (https://
We first converted the probes into gene symbols through the corresponding platform annotation information. The GSE54236, GSE102079, and TCGA-LIHC datasets were normalized using the "limma" package. We finally identified the differentially expressed genes (DEGs) with the cut-off conditions of |log
We first used the R package "survival" for univariate Cox regression analysis to identify genes significantly associated with survival by assessing the relationship between DECRGs and overall survival (OS) in the TCGA-LIHC dataset. The selected genes were then further screened by LASSO regression with the R package "glmnet". The variation in regression coefficients of the prognostic genes was identified by selecting the optimal and minimal criteria of the penalization parameter. Subsequently, multivariate Cox regression analysis was used to determine the key CRGs and establish the CRRS prognostic model. The Akaike information criterion (AIC) was used to balance the goodness of fit and the simplicity of the model. The model with the lowest AIC value was considered the simplest and most effective model, with the least information loss when predicting the result. The concordance index (C-index) was also calculated as a measure of accuracy in predicting survival outcomes.
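As an illustration of what the C-index measures, the following is a minimal Python sketch of Harrell's concordance index. It is not the implementation used in this study (which relied on R packages); the function name `harrell_c_index` and the toy data are hypothetical, and pairs with tied survival times are simply skipped, which is a simplifying assumption.

```python
def harrell_c_index(times, events, risk_scores):
    """Simplified Harrell's C-index.

    times       -- follow-up time per patient
    events      -- 1 if death was observed, 0 if censored
    risk_scores -- model risk score per patient (higher = worse prognosis)

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's follow-up time. Tied times are skipped
    (an assumption made here for brevity).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # earlier death had the higher score
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: shorter survival paired with higher risk score.
toy_c = harrell_c_index([1, 2, 3, 4], [1, 1, 1, 1], [4.0, 3.0, 2.0, 1.0])
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to random ranking, and values below 0.5 indicate inverted ranking.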
A total of 18 human HCC and 16 adjacent tissues were collected at the Department of General Surgery, Lanzhou University Second Hospital. All experiments involving human tissues complied with the principles of the Declaration of Helsinki and were approved by the Medical Ethics Committee of Lanzhou University Second Hospital. Total RNA was extracted using TRNzol Reagent and reverse-transcribed with FastKing gDNA Dispelling RT SuperMix (TIANGEN, Beijing, China). All qPCR reactions were conducted on a RotorGene 6000 PCR system (Qiagen) with SsoFast EvaGreen Supermix (Bio-Rad). The relative expression of the gene was calculated by the 2
A total of 370 patients were divided into high-risk and low-risk groups based on the median risk score (0.936) in the TCGA-LIHC dataset. The accuracy of the CRRS model was evaluated by Kaplan–Meier survival curves and ROC curves. We used the "survminer" package to draw Kaplan–Meier survival curves to analyze the relationship between risk score and OS. The "survival" and "timeROC" packages were used to draw the ROC curves. The AUC value of each ROC curve was calculated to evaluate the performance of the prognostic model. In addition, the "pheatmap" package was used to draw the risk curve, survival status plot, and heatmap of model genes in the high-risk and low-risk groups. The ICGC-LIRI-JP dataset was employed for external validation of the predictive performance of the model.
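The median-based grouping described above can be sketched in Python as follows. This is only an illustration: the function name `split_by_median` and the patient identifiers are hypothetical, and assigning scores exactly equal to the cutoff to the low-risk group is an assumption, since the text does not state how ties were handled.

```python
from statistics import median


def split_by_median(risk_scores, cutoff=None):
    """Assign each patient to the 'high' or 'low' risk group.

    risk_scores -- dict mapping patient ID to risk score
    cutoff      -- optional fixed cutoff (e.g. a median taken from a
                   training cohort); defaults to the median of the input

    Scores strictly above the cutoff are labelled 'high'; ties go to
    'low' (an assumption, not stated in the source).
    """
    if cutoff is None:
        cutoff = median(risk_scores.values())
    return {
        patient: ("high" if score > cutoff else "low")
        for patient, score in risk_scores.items()
    }


# Hypothetical scores for four patients.
toy_scores = {"P1": 0.5, "P2": 0.936, "P3": 1.2, "P4": 2.0}
toy_groups = split_by_median(toy_scores)
```

Passing `cutoff=0.936` would instead apply the TCGA-LIHC median reported in the text to another set of scores.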
Univariate and multivariate Cox regression analyses were conducted on the TCGA and ICGC datasets to evaluate whether the CRRS can be used as an independent prognostic factor. To quantify the survival probability of patients, a nomogram integrating risk score, age, gender, grade, and stage was constructed with the "rms" package. In addition, a decision curve analysis (DCA) was performed with the "ggDCA" package to determine the clinical application value of the risk score model by calculating the net benefits. Finally, a calibration curve was drawn to evaluate the predictive accuracy of the nomogram.
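The net benefit underlying a decision curve can be illustrated with the standard formula, net benefit = TP/n − (FP/n) × p_t/(1 − p_t), where p_t is the threshold probability. The Python sketch below assumes a simple binary outcome without censoring, which is a simplification of the survival setting analyzed here; it is not the "ggDCA" implementation, and the function name `net_benefit` and the toy values are hypothetical.

```python
def net_benefit(predicted_probs, outcomes, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    predicted_probs -- predicted event probability per patient
    outcomes        -- 1 if the event occurred, 0 otherwise (no censoring assumed)
    threshold       -- threshold probability p_t, strictly between 0 and 1
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(outcomes)
    pairs = list(zip(predicted_probs, outcomes))
    true_pos = sum(1 for p, y in pairs if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in pairs if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


# Hypothetical predictions and outcomes for four patients.
toy_nb = net_benefit([0.9, 0.8, 0.2, 0.1], [1, 0, 1, 0], threshold=0.5)
```

A decision curve plots this quantity over a range of thresholds for each model and compares it with the "treat all" and "treat none" strategies.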
To understand the functions of DEGs between the high-risk and low-risk groups, we used the "org.Hs.eg.db" and "clusterProfiler" packages to conduct Gene Ontology (GO) enrichment analysis and Gene Set Enrichment Analysis (GSEA). GO enrichment analysis includes biological process (BP), cellular component (CC), and molecular function (MF). Adjusted P < 0.05 was considered statistically significant.
The CIBERSORT database (
The Wilcoxon test was used to compare continuous variables between two groups. The Kruskal–Wallis test was used for comparisons among multiple groups. The chi-squared test was used to compare categorical variables between two groups. Hazard ratios and 95% confidence intervals were calculated using univariate and multivariate Cox analyses. All P values were two-sided. P < 0.05 was considered statistically significant. Statistical analyses were performed with R 4.1.0 and GraphPad Prism 8.0.1.
To obtain DEGs of HCC and control group, we analyzed the GSE54236, GSE102079 and TCGA-LIHC datasets with a Bayesian test. The screening criterion was | log
Fig. 1 Identification of DECRGs. A The volcano plot of DEGs in the GSE54236 dataset. B The volcano plot of DEGs in the GSE102079 dataset. C The volcano plot of DEGs in the TCGA-LIHC dataset. D The Venn diagram of shared DEGs among the GSE54236, GSE102079, and TCGA-LIHC datasets. E The Venn diagram of DECRGs
To identify the key CRGs with prognostic significance, we conducted univariate Cox regression analysis and screened out 28 genes significantly related to the survival of HCC patients (P < 0.05) in the TCGA dataset, of which 19 were risk factors and 9 were protective factors (Fig. 2A). Further, 9 genes were obtained by LASSO regression analysis with the minimal penalization parameter (Fig. 2B–C). Finally, 5 key CRGs were screened by multivariate Cox analysis, and a CRRS prognostic model was constructed with an AIC value of 1245.48 and a C-index of 0.7 (Fig. 2D). The 5 key CRGs were FLVCR1, CENPE, LCAT, CYP2C9, and NQO1. The prognostic model showed that FLVCR1, CENPE, and NQO1 were risk factors, while LCAT and CYP2C9 were protective factors. The coefficients of the CRGs are presented in Table 1. The CRRS prognostic model for calculating the HCC risk score was as follows: risk score = (0.1065 * FLVCR1) + (0.3748 * CENPE) + (−0.0082 * LCAT) + (−0.0019 * CYP2C9) + (0.0019 * NQO1).
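The risk score formula above is a weighted sum of the expression values of the 5 key CRGs. A minimal Python sketch is given below; the coefficients are those reported in the text, while the function name `crrs_risk_score` and the example expression values are hypothetical, and the expression values are assumed to be in the same units as those used to fit the model.

```python
# Coefficients as reported in the risk score formula (Table 1).
CRRS_COEFFICIENTS = {
    "FLVCR1": 0.1065,
    "CENPE": 0.3748,
    "LCAT": -0.0082,
    "CYP2C9": -0.0019,
    "NQO1": 0.0019,
}


def crrs_risk_score(expression):
    """Return the CRRS for one sample.

    expression -- dict mapping gene symbol to expression value
                  (units assumed to match the model's training data)
    """
    missing = [gene for gene in CRRS_COEFFICIENTS if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {', '.join(missing)}")
    # Weighted sum of expression values over the 5 model genes.
    return sum(coef * expression[gene] for gene, coef in CRRS_COEFFICIENTS.items())


# Hypothetical expression values for a single sample.
toy_sample = {"FLVCR1": 2.0, "CENPE": 1.0, "LCAT": 10.0, "CYP2C9": 100.0, "NQO1": 50.0}
toy_score = crrs_risk_score(toy_sample)
```

Positive coefficients (FLVCR1, CENPE, NQO1) raise the score as expression increases, while negative coefficients (LCAT, CYP2C9) lower it, matching the risk and protective roles described in the text.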
Fig. 2 Identification of a CRRS prognostic model for HCC. A The forest plot of prognostic CRGs identified by univariate Cox analysis. B Cross validation for tuning parameter selection in the LASSO regression analysis. C The 9 key CRGs were selected by the LASSO regression analysis
Table 1 The coefficients of the 5 genes
Gene symbol  Gene description                                         Coefficient
FLVCR1       Feline leukemia virus subgroup C cellular receptor 1     0.1065
CENPE        Centromere protein E                                     0.3748
LCAT         Lecithin-cholesterol acyltransferase                     −0.0082
CYP2C9       Cytochrome P450, family 2, subfamily C, polypeptide 9    −0.0019
NQO1         NAD(P)H dehydrogenase, quinone 1                         0.0019
Meanwhile, we analyzed the expression of the key CRGs in 371 LIHC tissues and 50 normal tissues with the UALCAN online tool. The results revealed that FLVCR1, CENPE, and NQO1 were expressed at significantly higher levels in the HCC group, while LCAT and CYP2C9 were expressed at significantly lower levels, compared with the control group (Fig. 3). To confirm the expression of the key CRGs, we collected HCC and normal tissues from Lanzhou University Second Hospital and performed qPCR to detect their relative mRNA expression. The results were consistent with the online data: FLVCR1, CENPE, and NQO1 showed higher expression, while LCAT and CYP2C9 showed lower expression, compared with the normal group (Fig. 4).
Fig. 3 The expression levels of key CRGs in HCC and normal tissues
Fig. 4 Relative mRNA expression of key CRGs in HCC and normal tissues detected by qPCR. *P < 0.05
To evaluate and validate the predictive potential of the risk score prognostic model, we analyzed the differences in survival between the high-risk and low-risk groups. The results showed that the OS of the high-risk group was shorter than that of the low-risk group (P < 0.001) in both the TCGA and ICGC datasets (Fig. 5A and D). The time-dependent ROC curve demonstrated that the AUC values for 1-, 3-, and 5-year OS in the TCGA dataset were 0.769, 0.691, and 0.674, respectively (Fig. 5B). Similarly, in the ICGC dataset, the AUC values for 1-, 3-, and 5-year OS were 0.787, 0.736, and 0.312, respectively (Fig. 5E). Furthermore, in the ROC curve containing risk score, age, gender, stage, and grade, the AUC value of the risk score was higher than that of the other indicators in the TCGA dataset (Fig. 5C). In the ICGC dataset, the AUC value of the risk score was higher than that of the other indicators except for stage (Fig. 5F). In addition, the number of deaths increased as the risk score increased. Compared with the low-risk group, FLVCR1, CENPE, and NQO1 were highly expressed in the high-risk group, while LCAT and CYP2C9 were expressed at low levels (Fig. 5G–I). The results in the ICGC dataset were consistent with those in the TCGA dataset (Fig. 5J–L).
Fig. 5 Evaluation and validation of the survival and risk of CRRS model. A, D The survival analysis of CRRS model in the TCGA-LIHC and ICGC-LIRI-JP datasets. B, E The ROC analysis of CRRS model in the TCGA-LIHC and ICGC-LIRI-JP datasets. C, F The ROC curve analysis of the CRRS model and clinical indicators in the TCGA-LIHC and ICGC-LIRI-JP datasets. G-I The distribution of risk score, survival status, and the heatmap of the 5 genes in the TCGA-LIHC dataset. J-L The distribution of risk score, survival status, and the heatmap of the 5 genes in the ICGC-LIRI-JP dataset
To determine whether the risk score prognostic model is applicable to HCC patients with different clinical characteristics, we analyzed the correlation between risk score and age, gender, pathological grades, tumor stages, and TNM stages, and identified a significant correlation between risk score and pathological grades, tumor stages, and T stages in the TCGA-LIHC dataset (P < 0.001) (Fig. 6A). The Kruskal–Wallis test showed that there were differences in risk score among pathologic grades, tumor stages, and T stages, although the differences in the risk score among some subgroups were not statistically significant (P > 0.05). The risk score tended to rise with pathological grades, tumor stages, and T stages (Fig. 6B–D).
Fig. 6 The risk score with the clinical indicators in the TCGA-LIHC dataset. A The heatmap for the 5 genes based on risk score and clinical indicators, ***P < 0.001. B The boxplot of risk score based on CRGs in HCC patients with different grades. C The boxplot of risk score in HCC patients with different tumor stages. D The boxplot of risk score in HCC patients with different T stages
Since CRRS was significantly associated with the malignancy of HCC, we examined whether CRRS was a clinically independent prognostic factor for HCC patients by univariate and multivariate Cox regression analyses. As expected, the results suggested that CRRS is an independent prognostic factor for HCC (Fig. 7A–D). Subsequently, based on the TCGA dataset, we constructed a nomogram of risk score, age, gender, grade, and stage, which provided a visual method for predicting the 1-, 3-, and 5-year survival probability of HCC patients (Fig. 7E). The DCA showed that CRRS had a higher clinical net benefit than the other clinical indicators (Fig. 7F). The calibration curve showed good agreement between the survival probability predicted by the nomogram and the observed probability (Fig. 7G). These results suggested that the established nomogram has good prognostic value for HCC patients.
Fig. 7 Construction and validation of the clinical prognostic model. A, B Univariate and multivariate Cox regression analyses based on risk score and other clinical indicators in the TCGA-LIHC dataset. C, D Univariate and multivariate Cox regression analyses based on risk score and other clinical indicators in the ICGC-LIRI-JP dataset. E The nomogram for predicting the probability of 1-, 3-, and 5-year OS for HCC patients. F The DCA of the 1-year survival probability in the TCGA-LIHC dataset. G The calibration curve of the nomogram for predicting 1-, 3-, and 5-year survival probability. ***P < 0.001
To further explore the differences in gene functions and pathways between the subgroups classified by CRG risk score, we identified 5707 DEGs between the high-risk and low-risk groups, which were mainly enriched in the ribosome and spliceosome. The enriched biological processes included ribosome biogenesis, RNA splicing, and cytoplasmic translation. The cellular components were mainly the ribosome and spliceosome. The molecular function analysis showed that most of the genes were involved in transcription coregulator activity, cadherin binding, ubiquitin-like protein ligase binding, GTPase binding, and ribonucleoprotein complex binding (Fig. 8A).
Fig. 8 Function enrichment analysis. A GO enrichment analysis. The size of the circle indicates the number of genes. The screening criterion was set as adjusted P < 0.05. B-C Enrichment plots from GSEA analysis in the low-risk and high-risk groups. For more information about KEGG pathway, see https://
GSEA analysis indicated that the most enriched pathways in the low-risk group were complement and coagulation cascades, drug metabolism cytochrome p450, fatty acid metabolism, metabolism of xenobiotics by cytochrome p450, and peroxisome (Fig. 8B). In contrast, cell cycle, DNA replication, ECM receptor interaction, hematopoietic cell lineage, and primary immunodeficiency were enriched in the high-risk group (Fig. 8C).
We investigated the immune cell differences between the high-risk and low-risk groups and found that, compared with the low-risk group, M0 macrophages were significantly more abundant in the high-risk group, while CD4
Fig. 9 Risk score and TME. A The radar map of the 22 immune cells. B The boxplot of immune-function score in high-risk and low-risk groups. C The differences in immune checkpoint genes between high-risk and low-risk groups. *P < 0.05, **P < 0.01, ***P < 0.001
The occurrence of HCC is a complex biological process. It is particularly important to explore the pathogenesis of HCC, find new biomarkers at the molecular level, and achieve early diagnosis and treatment.
In this study, we identified 5 key CRGs to construct a prognostic model. The prognostic model showed that FLVCR1, CENPE, and NQO1 were risk factors, while LCAT and CYP2C9 were protective factors. Meanwhile, compared with the control group, FLVCR1, CENPE, and NQO1 were expressed at significantly higher levels in HCC, while LCAT and CYP2C9 were expressed at significantly lower levels. According to growing evidence, CRGs are associated with the prognosis of many malignancies. Studies have shown that FLVCR1 plays a crucial role in various biological processes such as cell proliferation and apoptosis, and is highly expressed in HCC, where it is associated with increased cell proliferation and invasion.
HCC patients were divided into two groups based on risk score. FLVCR1, CENPE, and NQO1 were highly expressed in the high-risk group, while LCAT and CYP2C9 were highly expressed in the low-risk group. This further suggested that CRGs may be involved in driving HCC development. The survival curve showed that CRRS was correlated with OS, and the OS of the high-risk group was significantly shorter than that of the low-risk group. The ROC curve showed that the CRRS effectively predicted the OS of HCC patients. Univariate and multivariate Cox regression analyses showed that CRRS was an independent prognostic factor. It has been suggested that clinical indicators should be considered when screening prognostic features.
To explore possible intrinsic differences underlying the different prognoses of HCC, we performed enrichment analysis and immune analysis. Functional enrichment involved the ribosome and spliceosome. GSEA indicated that cell cycle, DNA replication, ECM receptor interaction, hematopoietic cell lineage, and primary immunodeficiency were enriched in the high-risk group. It is well known that dysregulation of these processes leads to the occurrence and metastasis of tumors.
This study also had some limitations. First, due to the lack of complete clinicopathological information, we collated only some clinical data for analysis. Second, although a relationship between CRGs and HCC prognosis was found in HCC patients, the underlying mechanism remains unclear, and further experiments are needed to study the role of CRGs in HCC.
In this study, we identified 5 key CRGs associated with HCC. The CRRS prognostic model constructed based on CRGs can effectively predict the prognosis of HCC patients. The combination of the risk score with other clinical indicators increased its clinical application potential. The risk score was also correlated with the TME. In addition, these HCC-associated CRGs may become new targets for the diagnosis or treatment of HCC.
The authors acknowledge the TCGA project, the GEO database, the ICGC project, and other groups for providing invaluable datasets for statistical analyses.
W-X Y, H-W G, and C-G Y conceived and designed the study. W-X Y, H-W G, J-B C, and A-A Z made the diagrams and tables of the article. W-X Y and H-W G wrote the paper. F-F W, J-Q X, and M-H L revised the article. All the authors read and approved the manuscript.
This work was supported by the Gansu Province Youth Science and Technology Foundation (21JR7RA421), the Cuiying Scientific and Technological Innovation Program of Lanzhou University Second Hospital (CY2021-BJ-A16, CY2022-QN-A18), and the Gansu Province Natural Science Foundation (21JR7RA407, 22JR11RA055).
The datasets used in this study can be found in the GEO database (https://
All experiments involving human tissues complied with the principles of the Declaration of Helsinki. This study was approved by the Medical Ethics Committee of Lanzhou University Second Hospital with approval number 2022A-441. Written informed consent was obtained from all participants in this study.
Not applicable.
The authors declare that they have no competing interests.
Additional file 1: Table S1. The clinical information of HCC samples in the TCGA dataset and the ICGC dataset.
Additional file 2: Table S2. List of primers.
By Wan-Xia Yang; Hong-Wei Gao; Jia-Bo Cui; An-An Zhang; Fang-Fang Wang; Jian-Qin Xie; Ming-Hua Lu and Chong-Ge You