Martin Misailovski (Göttingen / DE), Nikita Srivastava (Göttingen / DE), Sabine Blaschke (Göttingen / DE), Ralf Strobl (München / DE), Hani Kaba (Göttingen / DE), Milena Berens (Göttingen / DE), Andreas Beste (Göttingen / DE), Martin Kaase (Göttingen / DE), Andreas Leha (Göttingen / DE), Dhia Joseph Chackalackal (Göttingen / DE), Julia Chrampanis (Göttingen / DE), Jana-Michelle Kosub (Göttingen / DE), Uwe Groß (Göttingen / DE), Andreas Fischer (Göttingen / DE), Simone Scheithauer (Göttingen / DE)
Introduction: While machine learning algorithms are gaining popularity in clinical research, their feasibility as surveillance-related tools is still a work-in-progress.
Goals: To address this, we conducted a comprehensive evaluation of different machine learning models as surveillance-related tools, using the so-called Hospitalization Rate 2.0 as an example. This indicator aims to differentiate cases admitted due to COVID-19 from incidental SARS-CoV-2-positive cases.
Methods: In this single-center study, 345 patients hospitalized with a positive SARS-CoV-2 PCR test were included in the model development (January 2022 to June 2022). Outcomes were defined as primary case (admitted due to COVID-19) or incidental case (admitted for another reason). A total of 6 models were applied, including 3 linear models (Logistic Regression (LR), Linear Discriminant Analysis (LDA), Support Vector Classification (SVC)) and 3 non-linear models (K-Nearest Neighbors (KNN), Random Forest (RF), Extreme Gradient Boosting (XGBoost)). Models were then evaluated according to the area under the receiver operating characteristic curve (AUC-ROC), accuracy, precision, recall, F1-score, sensitivity, and specificity.
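As an illustration of the evaluation metrics named above, the following is a minimal sketch in plain Python, not the authors' pipeline. It assumes a binary coding of 1 = primary case and 0 = incidental case; the function names `binary_metrics` and `auc_roc`, the toy labels, and the 0.5 decision threshold are all hypothetical. Recall is computed here for the positive class only, so it coincides with sensitivity; the abstract does not state which averaging convention was used for its reported values.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Confusion-matrix metrics. Assumed coding: 1 = primary, 0 = incidental."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall of the positive class
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0

    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": sensitivity,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the probability that a random positive outranks a random
    negative (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (invented for illustration, not study data).
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
toy_preds = [int(s >= 0.5) for s in toy_scores]  # hypothetical 0.5 threshold

toy_metrics = binary_metrics(toy_labels, toy_preds)
toy_auc = auc_roc(toy_labels, toy_scores)
```

Threshold-based metrics such as sensitivity and specificity depend on the chosen cut-off, whereas AUC-ROC summarizes ranking quality across all thresholds, which is why the two can rank models differently.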
Results: Among the linear classifiers, LR showed the best overall performance (accuracy 0.75, precision 0.74, recall 0.75, F1-score 0.74, and sensitivity 0.75), followed by LDA and SVC, which had higher specificity (0.83 and 0.86, respectively). Among the non-linear classifiers, RF was superior to KNN in terms of accuracy, precision, recall, F1-score, and sensitivity, while KNN had higher specificity. Of all the applied classifiers, the XGBoost model had the strongest discriminative ability, with the highest accuracy (0.80), sensitivity (0.92), precision (0.82), recall (0.78), and F1-score (0.79).
Summary: Our results indicate that the XGBoost model identified primary cases more accurately than the other models applied. Implementing such models as surveillance-related tools could be a first step toward counting primary cases and, furthermore, toward optimizing resource allocation and hospital bed occupancy.