Chinese Journal of Antituberculosis ›› 2026, Vol. 48 ›› Issue (1): 139-147. doi: 10.19982/j.issn.1000-6621.20250352

• Original Articles •

Identification of efferocytosis-related core genes in active tuberculosis patients based on GEO database

Fan Weixiao, Zhou Ke, Liu Jiayun

  1. Department of Laboratory Medicine, Xijing Hospital of Air Force Medical University, Xi’an 710032, China
  • Received: 2025-08-29 Online: 2026-01-10 Published: 2025-12-31
  • Contact: Liu Jiayun E-mail: jiayun@fmmu.edu.cn
  • Supported by:
    National Key Research and Development Program of China (2022YFC2603705)

Abstract:

Objective: To identify efferocytosis-related genes that can serve as biomarkers for distinguishing active tuberculosis (ATB) from latent tuberculosis infection (LTBI) in humans.

Methods: The microarray dataset GSE28623, comprising peripheral blood samples from 46 ATB patients and 25 LTBI individuals, was downloaded from the GEO database. Differentially expressed genes (DEGs) were screened with the R package limma. Immune infiltration analysis was performed with CIBERSORT to compare the immune infiltration status of ATB patients and LTBI individuals, and GO and KEGG pathway enrichment analyses were conducted on the DEGs upregulated in ATB patients. Efferocytosis-related DEGs (EF-DEGs) were identified by intersecting the DEGs with an efferocytosis-related gene set. LASSO regression and SVM-RFE machine learning algorithms were then applied to the EF-DEGs to screen for key ATB genes. Finally, dataset GSE101705 was used to validate the expression levels of the key genes, and receiver operating characteristic (ROC) curves were plotted to evaluate how well these genes distinguish ATB from LTBI.

Results: Differential expression analysis identified 460 upregulated and 991 downregulated DEGs. Immune infiltration analysis indicated that ATB patients had significantly higher proportions of neutrophils (P=1.45×10⁻⁷) and M0 macrophages (P=7.55×10⁻⁶), and significantly lower proportions of naive CD4⁺ T cells (P=0.003), CD8⁺ T cells (P=1.45×10⁻⁷), and naive B cells (P=0.026). GO enrichment analysis showed that the DEGs upregulated in ATB were significantly enriched across all three ontologies: in biological processes, mainly response to molecules of bacterial origin, response to lipopolysaccharide, and myeloid leukocyte activation; in cellular components, mainly secretory granule lumen, cytoplasmic vesicle lumen, and vesicle lumen; and in molecular functions, mainly immune receptor activity, pattern recognition receptor activity, and lipopolysaccharide binding. KEGG pathway analysis showed that these ATB-upregulated DEGs were mainly involved in lipid and atherosclerosis, efferocytosis, and the NOD-like receptor signaling pathway, among others. Intersecting the DEGs with the efferocytosis-related gene set yielded 13 EF-DEGs (PROS1, SIAH2, CD274, WDFY3, SCARF1, ABCA1, DYNLT1, FPR2, CLU, IL1B, TNFSF13B, ITGB3, PLAUR). Taking the intersection of the genes selected by LASSO regression and by SVM-RFE identified CD274, PROS1, and SIAH2 as key genes associated with ATB. External validation with dataset GSE101705 confirmed that the expression levels of CD274, PROS1, and SIAH2 were significantly higher in the peripheral blood of ATB patients than in LTBI individuals. The areas under the ROC curve (AUC) for discriminating ATB were 89.5% (95% CI: 79.9%-99.1%), 88.4% (95% CI: 78.7%-98.1%), and 79.0% (95% CI: 64.6%-93.5%) for CD274, PROS1, and SIAH2, respectively, and the AUC for the three genes combined reached 93.6% (95% CI: 86.5%-100.0%).

Conclusion: Analysis of publicly available peripheral blood gene expression profiles from ATB patients and LTBI individuals identified CD274, PROS1, and SIAH2 as potential biomarkers with good discriminatory value for ATB.

Key words: Tuberculosis, Infection, Database, Efferocytosis, Genes
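
The gene-selection and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code (the abstract reports an R workflow based on limma, LASSO regression, and SVM-RFE). It only shows the two simplest operations involved: intersecting gene lists, and computing an AUC as the probability that a randomly chosen ATB sample has a higher expression value than a randomly chosen LTBI sample. The function names `intersect_genes` and `rank_auc`, the per-algorithm gene selections, and the expression values are all hypothetical; only the 13 EF-DEG symbols and the three key genes come from the abstract.

```python
from typing import Iterable, Sequence


def intersect_genes(*gene_lists: Iterable[str]) -> list[str]:
    """Return the gene symbols present in every list, sorted alphabetically."""
    sets = [set(genes) for genes in gene_lists]
    return sorted(set.intersection(*sets))


def rank_auc(case_values: Sequence[float], control_values: Sequence[float]) -> float:
    """AUC as the fraction of (case, control) pairs in which the case value
    is higher than the control value; ties count as half a pair."""
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# The 13 EF-DEGs listed in the abstract.
ef_degs = [
    "PROS1", "SIAH2", "CD274", "WDFY3", "SCARF1", "ABCA1", "DYNLT1",
    "FPR2", "CLU", "IL1B", "TNFSF13B", "ITGB3", "PLAUR",
]

# HYPOTHETICAL per-algorithm selections: the abstract reports only the
# final intersection, not what each algorithm selected on its own.
lasso_selected = ["CD274", "PROS1", "SIAH2", "FPR2"]
svm_rfe_selected = ["CD274", "PROS1", "SIAH2", "IL1B", "CLU"]

key_genes = intersect_genes(ef_degs, lasso_selected, svm_rfe_selected)

# MADE-UP expression values for one gene, used only to exercise rank_auc.
atb_expression = [7.9, 8.4, 8.1, 9.0, 7.2]
ltbi_expression = [6.8, 7.5, 7.0, 8.2]

auc = rank_auc(atb_expression, ltbi_expression)
print(key_genes, round(auc, 3))
```

With these made-up inputs the script prints the three key genes and an AUC of 0.8 (16 of the 20 ATB/LTBI pairs are ranked correctly). In a real analysis the AUC and its confidence interval would be estimated on the validation dataset with a dedicated ROC package.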
