
Chinese Journal of Antituberculosis, 2025, Vol. 47, Issue (8): 1053-1061. doi: 10.19982/j.issn.1000-6621.20250033

• Original Articles •

The value of machine learning algorithm-based diagnostic models in tuberculous pleural effusion

Jiao Jiahuan1,2, Sun Changfeng2,3, Wu Gang2, Huang Fuli2, Sheng Yunjian2

  1. Department of Infectious Diseases, The People’s Hospital of Leshan, Leshan 614000, China
  2. Department of Infectious Diseases, The Affiliated Hospital of Southwest Medical University, Luzhou 646000, China
  3. Infection and Immunity Laboratory, The Affiliated Hospital of Southwest Medical University, Luzhou 646000, China
  • Received: 2025-01-21 Online: 2025-08-10 Published: 2025-08-01
  • Contact: Sheng Yunjian, Email: sheng200410@163.com
  • Supported by:
    Sichuan Provincial Key Clinical Specialty Construction Project for Infectious Diseases (川卫医政函〔2024〕116号)

Abstract:

Objective: To explore the value of machine learning algorithm (MLA)-based diagnostic models for the diagnosis of tuberculous pleural effusion (TPE). Methods: A retrospective study was conducted. A total of 233 patients with pleural effusion who were admitted to The People’s Hospital of Leshan from January 2020 to September 2022 and met the inclusion criteria were enrolled as the internal experimental group. Patients were divided into a tuberculosis group (n=106) and a non-tuberculosis group (n=127) according to whether TPE was diagnosed. Clinical data were processed and analyzed using R software (version 4.1.1). Least absolute shrinkage and selection operator (LASSO) regression was used for variable selection, followed by the development of four MLA-based diagnostic models: random forest (RF), support vector machine with linear kernel (SVM-linear), support vector machine with polynomial kernel (SVM-polynomial), and multivariate logistic regression. The diagnostic performance of each model was evaluated using the area under the receiver operating characteristic curve (AUC) and compared with that of pleural adenosine deaminase (ADA). External validation was conducted using an independent cohort of 141 patients with pleural effusion (101 with TPE and 40 without TPE) from The Affiliated Hospital of Southwest Medical University during the same period. Results: LASSO regression analysis identified total pleural protein, pleural ADA, mononuclear cell ratio in pleural fluid, serum neutrophil ratio, platelet count, fever, and night sweats as risk factors for TPE (penalty coefficients: 0.216, 0.058, 0.003, 0.049, 0.000, 0.045, and 1.605, respectively), whereas pleural carcinoembryonic antigen (CEA), polymorphonuclear cell ratio in pleural fluid, and peripheral white blood cell count were associated with a lower risk of TPE (penalty coefficients: -0.072, -0.029, and -0.567, respectively). The four MLA-based diagnostic models showed sensitivities for TPE of 91.8% (RF), 84.5% (SVM-linear), 86.9% (SVM-polynomial), and 85.4% (multivariate logistic regression); specificities of 99.0%, 81.6%, 93.8%, and 81.6%; and AUC values of 0.988, 0.875, 0.959, and 0.886, respectively, all exceeding those of pleural ADA (sensitivity 83.1%, specificity 77.9%, AUC 0.820). In the external validation cohort, the AUCs of the RF, SVM-linear, SVM-polynomial, and multivariate logistic regression models were 0.834, 0.827, 0.817, and 0.815, respectively. Conclusion: The novel random forest-based diagnostic model demonstrated the best diagnostic performance for identifying TPE, providing a simpler, more rapid, and clinically effective diagnostic approach.
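
The three metrics reported above (sensitivity, specificity, and AUC) can be illustrated with a minimal sketch. It is written in Python rather than the R software used in the study, and it is not the authors' code: the function names `sensitivity_specificity` and `auc`, the 0.5 decision threshold, and the eight toy cases are all assumptions made up for illustration. AUC is computed here by its rank interpretation, the probability that a randomly chosen TPE case receives a higher model score than a randomly chosen non-TPE case.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(labels: Sequence[int], preds: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = TPE, 0 = non-TPE)."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration only; these are NOT values from the study.
labels = [1, 1, 1, 1, 0, 0, 0, 0]                   # 1 = TPE, 0 = non-TPE
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]   # hypothetical model outputs
preds = [1 if s >= 0.5 else 0 for s in scores]      # assumed 0.5 threshold

sens, spec = sensitivity_specificity(labels, preds)
model_auc = auc(labels, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={model_auc:.4f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the abstract reports all three for each model and for pleural ADA.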

Key words: Tuberculosis, pleural; Pleural effusion; Diagnosis, computer-assisted; Models, statistical; Algorithms; Artificial intelligence algorithms
