by Xie Qiu, Shuo Hu, Shumin Dong, Haijun Sun
Objective: To develop a predictive framework integrating machine learning and clinical parameters for postoperative pulmonary complications (PPCs) in non-small cell lung cancer (NSCLC) patients undergoing video-assisted thoracic surgery (VATS).
Methods: This retrospective study analyzed 286 NSCLC patients (2022–2024), incorporating 13 demographic, metabolic-inflammatory, and surgical variables. An Improved Blood-Sucking Leech Optimizer (IBSLO) enhanced via Cubic mapping and opposition-based learning was developed. Model performance was evaluated using AUC-ROC, F1-score, and decision curve analysis (DCA). SHAP interpretation identified key predictors.
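The two enhancements named in the Methods, chaotic initialization with a Cubic map and opposition-based learning, can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption: the map form x ← ρ·x·(1 − x²) with ρ = 2.595 is one commonly used variant, the opposite point is taken as lb + ub − x, and the names `cubic_map_sequence`, `init_population`, and `sphere` are hypothetical.

```python
def cubic_map_sequence(n, x0=0.3, rho=2.595):
    """Return n chaotic values in (0, 1) from x <- rho * x * (1 - x^2).

    Assumed variant of the Cubic map; the paper's exact form may differ.
    """
    values = []
    x = x0
    for _ in range(n):
        x = rho * x * (1.0 - x * x)
        values.append(x)
    return values


def init_population(pop_size, dim, lb, ub, fitness):
    """Chaotic initialization followed by opposition-based selection."""
    chaos = cubic_map_sequence(pop_size * dim)
    candidates = []
    for i in range(pop_size):
        # Scale each chaotic value from (0, 1) into the search bounds.
        point = [lb + (ub - lb) * chaos[i * dim + j] for j in range(dim)]
        # Opposite point: reflect each coordinate across the interval midpoint.
        opposite = [lb + ub - v for v in point]
        candidates.append(point)
        candidates.append(opposite)
    # Keep the pop_size best candidates (minimization assumed).
    candidates.sort(key=fitness)
    return candidates[:pop_size]


def sphere(x):
    """Toy objective for demonstration only; not a CEC2022 function."""
    return sum(v * v for v in x)


population = init_population(pop_size=5, dim=3, lb=-10.0, ub=10.0, fitness=sphere)
```

Evaluating both each chaotic point and its opposite doubles the number of fitness evaluations at initialization; the intent is a better-spread starting population.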
Results: The IBSLO demonstrated significantly superior convergence performance versus the original BSLO, ant lion optimizer (ALO), Harris hawks optimization (HHO), and whale optimization algorithm (WOA) across all 12 CEC2022 test functions. Subsequently, the IBSLO-optimized automated machine learning (AutoML) model achieved ROC-AUC/PR-AUC values of 0.9038/0.8091 (training set) and 0.8775/0.8175 (testing set), significantly outperforming four baseline models: logistic regression (LR), support vector machine (SVM), XGBoost, and LightGBM. SHAP interpretability analysis identified six key predictors: preoperative leukocyte count, body mass index (BMI), surgical approach, age, intraoperative blood loss, and C-reactive protein (CRP). Decision curve analysis demonstrated significantly higher net clinical benefit of the AutoML model compared to conventional methods across expanded threshold probability ranges (training set: 8–99%; testing set: 3–80%).
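For readers unfamiliar with decision curve analysis, the net benefit at a threshold probability p_t is conventionally computed as TP/n − (FP/n) · p_t/(1 − p_t). The sketch below applies that standard definition to toy data; it is not the authors' code, the function names are hypothetical, and the labels and probabilities are invented purely for illustration.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data, invented for illustration only.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.7]

nb_model = net_benefit(y_true, y_prob, threshold=0.5)
nb_all = net_benefit_treat_all(y_true, threshold=0.5)
```

A decision curve plots these quantities over a range of thresholds. A model is clinically useful at a given threshold when its net benefit exceeds both the treat-all value and zero (treat none).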
Conclusion: This study establishes an interpretable machine learning framework that improves preoperative risk stratification for NSCLC patients, offering actionable guidance for thoracic oncology practice.