Chinese Journal of Antituberculosis, 2026, Vol. 48, Issue (6): 830-839. doi: 10.19982/j.issn.1000-6621.20260040

• Original Articles •

Construction of an optimal prediction model for treatment outcomes in retreatment pulmonary tuberculosis

Du Xilong1, Maiwulajiang·Yimamu2, Na Yan1, Paziliya·Yasheng1, Guo Gang3, Maiweilanjiang·Abulimiti4, Zhang Liping5, Zheng Yanling5

    1 School of Public Health, Xinjiang Medical University, Urumqi 830017, China
    2 Kashgar Prefecture Center for Disease Control and Prevention, Xinjiang Uygur Autonomous Region, Kashgar 844000, China
    3 Urumqi Maternal and Child Health Hospital, Xinjiang Uygur Autonomous Region, Urumqi 830037, China
    4 Kashgar Prefecture Tuberculosis Control Institute (Kashgar Prefecture Pulmonary Hospital), Xinjiang Uygur Autonomous Region, Kashgar 844000, China
    5 School of Medical Engineering and Technology, Xinjiang Medical University, Urumqi 830017, China
  • Received: 2026-01-22 Published: 2026-06-10 Online: 2026-05-25
  • Contact: Zheng Yanling, E-mail: zhengyl_math@sina.cn
  • Supported by:
    Outstanding Young Talents in Xinjiang: Youth Science and Technology Innovation Talent Project (2024TSYCCX0080); Outstanding Young Talents in Xinjiang: Youth Science and Technology Innovation Talent Project (2024TSYCJC0061); College Students’ Innovation and Entrepreneurship Training Program Project (S202510760111)

Abstract:

Objective: To systematically compare the performance of nine machine learning models in classifying treatment outcomes, based on the clinical data of patients with retreatment pulmonary tuberculosis, in order to construct an optimal prediction model and provide a predictive tool for optimizing the management of these patients. Methods: A total of 1396 patients with retreatment pulmonary tuberculosis registered and managed in Kashgar Prefecture, Xinjiang Uygur Autonomous Region, from January 1 to December 31, 2022 were enrolled and randomly divided into a training set (978 cases) and a test set (418 cases) at a 7:3 ratio. Feature selection was performed using Random Forest and Cramér’s V coefficient. Nine machine learning models were constructed: Logistic Regression, Support Vector Machine, Decision Tree, Random Forest, XGBoost, LightGBM, Gradient Boosting Tree, Multilayer Perceptron, and CatBoost. Model performance was evaluated using accuracy, precision, recall, F1-score, area under the curve (AUC), and average precision (AP). The optimal model was selected and interpreted using SHapley Additive exPlanations (SHAP) analysis. Results: The CatBoost model showed the best overall performance on the test set, with an accuracy of 89.2%, precision of 88.5%, recall of 89.2%, F1-score of 0.885, AUC of 0.829, and AP of 0.941. SHAP analysis showed that the 2-month sputum smear examination (mean absolute SHAP value = 0.595), treatment regimen (mean absolute SHAP value = 0.367), and treatment mode (mean absolute SHAP value = 0.290) were the three features contributing most to the prediction of treatment outcomes in retreatment pulmonary tuberculosis. Conclusion: The CatBoost model performed robustly and well in predicting adverse treatment outcomes in patients with retreatment pulmonary tuberculosis. Combined with SHAP interpretation, it can identify key predictors and provide a practical tool for assessing the risk of adverse treatment outcomes in these patients.
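
For readers unfamiliar with the Cramér’s V coefficient named in the Methods, the following is a minimal sketch of how the statistic is computed for two categorical variables. It is not the authors' code. The abstract does not say how Cramér’s V was combined with Random Forest importance, whether a bias correction was applied, or which thresholds were used, so none of that is shown. The function name `cramers_v`, the variable names, and the toy data are hypothetical.

```python
import math
from collections import Counter


def cramers_v(x, y):
    """Uncorrected Cramér's V between two categorical sequences.

    V = sqrt(chi2 / (n * (min(r, c) - 1))), where chi2 is Pearson's
    chi-square statistic of the r x c contingency table of x and y.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")

    n = len(x)
    joint = Counter(zip(x, y))  # observed cell counts
    row = Counter(x)            # row marginals
    col = Counter(y)            # column marginals

    k = min(len(row), len(col)) - 1
    if k == 0:
        # A variable with a single level has no measurable association.
        return 0.0

    chi2 = 0.0
    for a, row_total in row.items():
        for b, col_total in col.items():
            expected = row_total * col_total / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    return math.sqrt(chi2 / (n * k))


# Hypothetical toy data, not taken from the study.
regimen = ["A", "A", "B", "B"]
outcome_linked = ["good", "good", "poor", "poor"]     # fully determined by regimen
outcome_unlinked = ["good", "poor", "good", "poor"]   # independent of regimen

v_linked = cramers_v(regimen, outcome_linked)
v_unlinked = cramers_v(regimen, outcome_unlinked)
```

V ranges from 0 (no association) to 1 (one variable fully determines the other). In a feature-screening setting it is typically used either to rank categorical predictors against the outcome or to flag pairs of predictors that are strongly associated with each other.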

Key words: Tuberculosis, pulmonary; Retreatment; Treatment outcome; Models, statistical; Forecasting
