
中国防痨杂志, 2025, Vol. 47, Issue (11): 1508-1514. doi: 10.19982/j.issn.1000-6621.20250224

• Original Articles •

基于机器学习算法的诊断模型对结核性胸膜炎的应用价值

李婷婷1, 刘欢庆2, 雷倩3, 尤著宏2, 赵国连4

  1. 西安市胸科医院药物临床试验机构办公室,西安 710100
  2. 西北工业大学信息化管理处,西安 710072
  3. 西安市胸科医院药剂科,西安 710100
  4. 西安市胸科医院检验科,西安 710100
  • 收稿日期:2025-05-26 出版日期:2025-11-10 发布日期:2025-10-30
  • 通信作者: 赵国连 E-mail:774567495@qq.com

Application value of machine-learning-based diagnostic model on tuberculous pleurisy

Li Tingting1, Liu Huanqing2, Lei Qian3, You Zhuhong2, Zhao Guolian4

  1. Office of Drug Clinical Trial Institution, Xi’an Chest Hospital, Xi’an 710100, China
  2. Department of Information Management, Northwestern Polytechnical University, Xi’an 710072, China
  3. Department of Pharmacy, Xi’an Chest Hospital, Xi’an 710100, China
  4. Department of Laboratory Medicine, Xi’an Chest Hospital, Xi’an 710100, China
  • Received: 2025-05-26 Online: 2025-11-10 Published: 2025-10-30
  • Contact: Zhao Guolian E-mail: 774567495@qq.com

摘要:

目的: 构建基于机器学习的结核性胸膜炎诊断预测模型,提高临床诊断准确性。方法: 回顾性收集2020年1月至2021年12月期间西安市胸科医院收治的523例胸腔积液患者(结核性胸膜炎375例,非结核性胸膜炎148例)的临床资料。纳入腺苷脱氨酶(adenosine deaminase, ADA)、结核感染T细胞斑点试验(T-SPOT.TB)、C反应蛋白(C-reactive protein, CRP)等15项指标,采用随机森林、支持向量机、神经网络等7种机器学习算法构建预测模型,通过5折交叉验证评估模型性能,使用SHapley加法解释(SHapley Additive exPlanations,SHAP)算法进行特征重要性分析。结果: 神经网络模型性能最优,测试集曲线下面积(area under the curve,AUC)为0.932,准确率为88.6%,精确率和召回率分别为94.4%和89.3%。SHAP分析显示,ADA(SHAP值为0.12~0.18)和T-SPOT.TB(SHAP值为0.10~0.15)是最重要的预测因子,且两者存在显著协同效应(P<0.001)。结论: 本研究构建的神经网络模型具有较高的诊断效能,通过可解释性分析明确了关键预测因子及其交互作用,为结核性胸膜炎的精准诊断提供了新工具。该模型可辅助临床决策,特别适用于传统诊断中的“灰色区域”病例。

关键词: 结核, 胸膜炎, 诊断, 计算机辅助, 模型,统计学, 人工智能

Abstract:

Objective: To develop a machine-learning-based predictive model for diagnosing tuberculous pleurisy (TBP) and improve clinical diagnostic accuracy. Methods: We retrospectively collected clinical data from 523 patients with pleural effusion (375 with TBP and 148 without TBP) admitted to Xi’an Chest Hospital between January 2020 and December 2021. Fifteen indicators, including adenosine deaminase (ADA), the T-cell spot test for tuberculosis infection (T-SPOT.TB), and C-reactive protein (CRP), were included. Seven machine learning algorithms, including random forest, support vector machine, and neural network, were used to construct predictive models. Model performance was evaluated using 5-fold cross-validation, and feature importance was analyzed using SHapley Additive exPlanations (SHAP). Results: The neural network model performed best, achieving an area under the curve (AUC) of 0.932 on the test set, with an accuracy of 88.6%, a precision of 94.4%, and a recall of 89.3%. SHAP analysis identified ADA (SHAP value: 0.12-0.18) and T-SPOT.TB (SHAP value: 0.10-0.15) as the two most important predictors, with a significant synergistic effect between them (P<0.001). Conclusion: The neural network model developed in this study showed high diagnostic performance. Interpretability analysis identified the key predictors and their interaction, providing a new tool for the precise diagnosis of TBP. The model can assist clinical decision-making, particularly for cases in the “gray zone” of conventional diagnosis.
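
As a rough illustration of the evaluation described in the abstract, the sketch below splits a cohort of the reported size (375 TBP, 148 non-TBP) into 5 folds and defines the three reported metrics from confusion-matrix counts. It is not the authors' code. The function names `stratified_folds` and `binary_metrics` are hypothetical, and stratifying the folds by class is an assumption: the abstract states only that 5-fold cross-validation was used.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Assign each sample index to one of k folds with similar class ratios.

    Assumption: stratification by class is illustrative; the abstract does
    not say how the folds were formed.
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal indices round-robin so each fold gets a near-equal share.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


# Cohort sizes from the abstract: 375 TBP (label 1), 148 non-TBP (label 0).
labels = [1] * 375 + [0] * 148
folds = stratified_folds(labels, k=5)
for n, fold in enumerate(folds, start=1):
    tbp = sum(labels[i] for i in fold)
    print(f"fold {n}: {len(fold)} patients, {tbp} TBP, {len(fold) - tbp} non-TBP")
```

In each round of cross-validation, one fold serves as the held-out set and the remaining four are used for training. `binary_metrics` would then be applied to the held-out predictions, with TBP treated as the positive class.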

Key words: Tuberculosis; Pleurisy; Diagnosis, computer-assisted; Models, statistical; Artificial intelligence algorithms

中图分类号: