Journal of East China Normal University (Natural Science) ›› 2024, Vol. 2024 ›› Issue (4): 100-110. doi: 10.3969/j.issn.1000-5641.2024.04.010

• Life Sciences •

Bioinformatics-based construction of immune prognostic gene model for hepatocellular carcinoma and preliminary model validation

Linding XIE1,2, Yuan ZHANG3, Yihong CAI1,2,*

  1. Department of Health Inspection and Quarantine, School of Public Health, Anhui Medical University, Hefei 230032, China
  2. Anhui Province Key Laboratory of Zoonoses, Anhui Medical University, Hefei 230032, China
  3. Department of Clinical Laboratory, Suzhou Wujiang District Children's Hospital, Suzhou, Jiangsu 215234, China
  • Received: 2023-02-10 Accepted: 2023-06-29 Online: 2024-07-25 Published: 2024-07-23
  • Contact: Yihong CAI E-mail: yihongcai2022@163.com
  • Supported by: Doctoral Fund of Anhui Medical University (XJ202005); Basic and Clinical Cooperative Research Program of Anhui Medical University (2021xkjT033)

Abstract:

RNA sequencing data from patients with hepatocellular carcinoma (HCC) were collected from The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) databases. First, key genes involved in the immune response in HCC were screened using non-negative matrix factorization (NMF) clustering and weighted gene co-expression network analysis (WGCNA). A prognostic gene model was then constructed using least absolute shrinkage and selection operator (LASSO) regression, and biological functions were analyzed using gene set enrichment analysis (GSEA). Subsequently, single-sample gene set enrichment analysis (ssGSEA) was used to assess differences in immune infiltration and related functions between patients in the two risk groups. A nomogram combining the independent risk factors was constructed with the "rms" R package to predict patients' overall survival. Finally, preliminary clinical validation was performed using the Human Protein Atlas (HPA) database and real-time quantitative PCR (RT-qPCR). In conclusion, we integrated patients' clinical characteristics with the risk score to construct a verifiable and reproducible nomogram, providing a reliable reference for the precision treatment of cancer patients in the clinic.

Key words: bioinformatics, weighted gene co-expression network analysis, hepatocellular carcinoma, immune-related genes, prognostic model
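
The abstract refers to a risk score and to two risk groups without giving the formula. The sketch below illustrates one common way such a score is formed after LASSO regression: a sum of gene expression values weighted by the fitted coefficients, with patients split at the cohort median. The gene names, coefficients, expression values, and the median cutoff are all hypothetical assumptions made for illustration; they are not taken from the paper.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear risk score: sum of coefficient * expression over the model genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def assign_risk_groups(scores):
    """Label each patient 'high' or 'low' relative to the cohort median score."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical coefficients and expression values, for illustration only.
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "P2": {"GENE_A": 4.0, "GENE_B": 2.0},
    "P3": {"GENE_A": 6.0, "GENE_B": 0.0},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
groups = assign_risk_groups(scores)
```

A patient whose score equals the median falls into the low-risk group here; that tie-breaking choice is also an assumption and may differ from the authors' implementation.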

CLC Number: