ANTIMICROBIAL RESISTANCE PREDICTION AND KNOWLEDGE EXTRACTION FROM IRREGULAR TIME SERIES USING DYNAMIC TIME WARPING AND LONGEST COMMON SUBSEQUENCES ALGORITHMS
Abstract
Antimicrobial multidrug resistance (AMR) is considered one of the most dangerous threats to global health, so much so that the World Health Organization approved a Global Action Plan to address this growing problem. AMR has a major impact on health and the economy: it causes hospital treatments to fail, increases mortality, and raises the economic burden through longer hospital stays. Patients in the Intensive Care Unit (ICU) are especially at risk because they are critically ill. To address this global problem, different Artificial Intelligence models and data science approaches have been introduced to predict AMR, shortening the time to diagnosis compared with conventional susceptibility tests and providing clinicians with the information needed to choose an appropriate treatment. This end-of-degree project aims to contribute to the state of the art by using irregular multivariate time series and two machine learning methods, namely Logistic Regression (LR) and Support Vector Machine, to predict AMR. We applied the Longest Common Subsequences (LCSS) and Dynamic Time Warping (DTW) algorithms to preserve the temporal structure of the data. The second objective, extracting knowledge from the data, was addressed with visualization techniques and clustering methods: T-distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP) were used for visualization, and Spectral Clustering (SC) was used for clustering. In addition, three Feature Selection (FS) methods were used: Mutual Information, Bootstrap Confidence Intervals, and Lasso. The data were provided by the University Hospital of Fuenlabrada and cover the period between 2004 and 2020. They include demographic data, the antibiotics patients received during their ICU stay, other treatments they were undergoing, and data on the patients' ICU neighbors.
Regarding the classification of AMR, the best results were obtained with DTW and LR without applying FS, which gave an Area Under the Curve (AUC) of 66%. For DTW, t-SNE and UMAP performed similarly, providing well-defined groups of data that SC then divided into clusters. Overall, LCSS gave worse classification and clustering results. From the clustering step, we were able to clearly differentiate two clusters from the rest. One was made up of patients coming from general surgery, with the highest mean APACHE II score (indicating greater severity) and the highest consumption of antibiotics. The other corresponded to patients who mostly had an APACHE II score of either around 15 or around 30, who were admitted from the emergency department, and who had significantly lower consumption of antibiotics. These results could serve as a starting point for future lines of work addressing the problem caused by AMR.
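As an illustration of the two sequence measures named in the abstract, the following is a minimal sketch of a DTW distance and an LCSS similarity between two multivariate series of different lengths. It is not the implementation used in the thesis: the function names, the Euclidean point-wise cost, the per-feature tolerance `eps`, and the normalization by the shorter series are all assumptions made for this example.

```python
from math import inf, sqrt


def euclidean(x, y):
    """Euclidean distance between two feature vectors of equal length."""
    return sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def dtw_distance(a, b):
    """Classic dynamic-programming DTW between two sequences of vectors.

    The sequences may have different lengths, which is what makes DTW
    usable on series with an unequal number of time steps.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def lcss_similarity(a, b, eps):
    """LCSS similarity in [0, 1] between two sequences of vectors.

    Two time steps "match" when every feature differs by at most eps
    (an assumed matching rule). The count of matched steps is divided
    by the length of the shorter sequence.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        return 0.0
    # length[i][j] = longest common subsequence of a[:i] and b[:j]
    length = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if all(abs(x - y) <= eps for x, y in zip(a[i - 1], b[j - 1])):
                length[i][j] = length[i - 1][j - 1] + 1
            else:
                length[i][j] = max(length[i - 1][j], length[i][j - 1])
    return length[n][m] / min(n, m)


# Two toy series with two features per time step and different lengths.
series_a = [(0.0, 1.0), (1.0, 1.0), (2.0, 0.0)]
series_b = [(0.0, 1.0), (1.0, 1.0), (1.0, 1.0), (2.0, 0.0)]
print(dtw_distance(series_a, series_b))
print(lcss_similarity(series_a, series_b, eps=0.1))
```

Pairwise values computed this way could be arranged into a distance or similarity matrix and passed to a kernel-based classifier or to a clustering method that accepts precomputed affinities; how the thesis actually does this is not stated in the abstract.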
Description
Bachelor's thesis (Trabajo Fin de Grado) defended at the Universidad Rey Juan Carlos in the 2022/2023 academic year. Supervisors: Cristina Soguero Ruíz, Óscar Escudero Arnanz
Collections
- Trabajos Fin de Grado [8121]
Items in digital-BURJC are protected by copyright, with all rights reserved, unless otherwise indicated