Ann-Kathrin Dörr (Essen / DE), Sultan Imangaliyev (Essen / DE), Folker Meyer (Essen / DE), Ivana Kraiselburd (Essen / DE)
Detecting significant changes in the bacterial communities of patients could be a step toward early detection of infectious diseases such as sepsis, which in turn could improve the patient's chance of survival [1]. We aim to predict changes in the abundance of bacterial genera over time by analyzing 16S rRNA gene amplicon time series data. To this end, we employ Long Short-Term Memory (LSTM) [2] models for prediction and Shapley Additive Explanations (SHAP) [3] for feature importance analysis. So far, the model has performed well in predicting the overall abundance range of bacterial genera in patient samples over time. Outlier detection is implemented to distinguish significant changes from normal fluctuations. Because wastewater surveillance gained importance during the COVID-19 pandemic, we also applied similar models to time series data from wastewater treatment plants. By applying machine learning models to time series data for anomaly detection, we hope to provide a treatment advantage for physicians and patients. In an environmental context, we aim to contribute to environmental pathogen surveillance and an early warning system. In the future, we plan to test different model architectures and data types to further improve the results.
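The abstract does not state which outlier detection method is used. As an illustration only, the following minimal sketch assumes one common approach: comparing observed abundances against model predictions and flagging time points whose residual has a large modified z-score (based on the median absolute deviation). The function name `flag_outliers`, the threshold of 3.5, and the example values are hypothetical and not taken from the study.

```python
from statistics import median


def flag_outliers(observed, predicted, threshold=3.5):
    """Flag time points whose prediction residual is unusually large.

    Assumption: residuals are scored with the modified z-score, which uses
    the median absolute deviation (MAD) and is robust to the outliers
    themselves. This is an illustrative choice, not the authors' method.
    """
    residuals = [o - p for o, p in zip(observed, predicted)]
    med = median(residuals)
    mad = median(abs(r - med) for r in residuals)
    if mad == 0:
        # No spread in the residuals: the score is undefined, flag nothing.
        return [False] * len(residuals)
    return [abs(0.6745 * (r - med) / mad) > threshold for r in residuals]


# Hypothetical relative abundances of one genus over six time points.
observed = [0.10, 0.12, 0.11, 0.13, 0.45, 0.12]
predicted = [0.11, 0.11, 0.12, 0.12, 0.12, 0.12]
flags = flag_outliers(observed, predicted)
```

In this made-up series only the fifth time point is flagged, since its residual is far larger than the typical deviation between prediction and observation. In practice the predictions would come from the trained LSTM model, and the threshold would need to be tuned per genus and data set.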
References:
[1] Ferrer, R. et al. Empiric Antibiotic Treatment Reduces Mortality in Severe Sepsis and Septic Shock From the First Hour. 2014. Critical Care Medicine.
[2] Baranwal, M. et al. Recurrent neural networks enable design of multifunctional synthetic human gut microbiome dynamics. 2022. eLife.
[3] Lundberg, S.M., Lee, S.-I. A Unified Approach to Interpreting Model Predictions. 2017. Advances in Neural Information Processing Systems (NIPS).