Interpretable clinical time-series modeling with intelligent feature selection for early prediction of antimicrobial multidrug resistance
Electronic health records provide rich, heterogeneous data about the evolution of patients’ health status. However, such data need to be processed carefully, with the aim of extracting meaningful information for clinical decision support. In this paper, we leverage interpretable (deep) learning and signal processing tools to deal with multivariate time-series data collected from the Intensive Care Unit (ICU) of the University Hospital of Fuenlabrada (Madrid, Spain). The presence of antimicrobial multidrug-resistant (AMR) bacteria is one of the greatest threats to the health system in general and to ICUs in particular, due to the critical health status of the patients therein. Thus, early identification of bacteria in the ICU and early prediction of their antibiotic resistance are key for the patients’ prognosis. While intelligent data-based processing and learning schemes can contribute to this early prediction, their acceptance and deployment in ICUs require the automatic schemes to be not only accurate but also understandable by clinicians. Accordingly, we have designed trustworthy intelligent models for the early prediction of AMR based on the combination of meaningful feature selection with interpretable recurrent neural networks. These models were created using irregularly sampled clinical measurements, considering both the health status of the patient and the global ICU environment. We explored several strategies to cope with strongly imbalanced data, since only a few ICU patients are infected by AMR bacteria. Our approach exhibits a good balance between performance and interpretability, especially considering the difficulty of the classification task at hand. A multitude of factors are involved in the emergence of AMR (several of them not fully understood), and the records only contain a subset of them.
In addition, the limited number of patients, the imbalance between classes, and the irregularity of the data make the problem harder to solve. Our models are also enriched with SHAP post-hoc interpretability and validated by clinicians, who considered model understandability and trustworthiness to be of paramount importance in practice. Moreover, we use linguistic fuzzy systems to provide clinicians with explanations in natural language. Such explanations are automatically generated from a pool of interpretable rules that describe the interaction among the most relevant features identified by SHAP. Notably, clinicians were especially satisfied with the new insights provided by our models. Such insights helped them to trust the automatic schemes and use them to make (better) decisions to mitigate AMR spreading in the ICU. All in all, this work paves the way towards more comprehensible time-series analysis in the context of early AMR prediction in ICUs and reduces the time to detection of infectious diseases, opening the door to better hospital care.
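As an illustration of the class imbalance issue mentioned above, the following minimal sketch shows one common strategy: inverse-frequency class weighting applied to a binary cross-entropy loss. This is an assumption made for illustration only; the abstract does not state which strategies were actually explored, and the function names `inverse_frequency_weights` and `weighted_log_loss` are hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Hypothetical helper: weight each class by n_samples / (n_classes * class_count).

    Rare classes (e.g. AMR-positive patients) receive larger weights.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def weighted_log_loss(labels, probs, weights, eps=1e-12):
    """Hypothetical helper: class-weighted binary cross-entropy.

    labels  -- iterable of 0/1 ground-truth labels
    probs   -- predicted probability of class 1 for each sample
    weights -- mapping from class label to weight
    """
    total, weight_sum = 0.0, 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        sample_loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        total += weights[y] * sample_loss
        weight_sum += weights[y]
    return total / weight_sum


# Toy example: 9 non-AMR patients and 1 AMR patient (made-up labels).
labels = [0] * 9 + [1]
weights = inverse_frequency_weights(labels)
loss = weighted_log_loss(labels, [0.5] * len(labels), weights)
```

With these weights, the single positive sample contributes as much to the loss as the nine negative samples combined, so a classifier cannot reach a low loss simply by predicting the majority class.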
This work is supported by the Spanish NSF grants PID2019-106623RB-C41 (BigTheory), PID2019-105032GB-I00 (SPGraph), PID2019-107768RA-I00 (AAVis-BMR), RTI2018-099646-B-I00 (ADHERE-U); the Galician Ministry of Education, University and Professional Training grants ED431F 2018/02 (eXplica-IA) and ED431G2019/04; the Instituto de Salud Carlos III (Spain) grant DTS17/00158; as well as the Community of Madrid in the framework of the Multiannual Agreement with Rey Juan Carlos University in line of action 1, “Encouragement of Young Phd students investigation” Project Ref. F661 (Mapping-UCI). Sergio M. Aguero is a recipient of the Predoctoral Contracts for Trainees URJC Grant (PREDOC21-036). Jose M. Alonso-Moral is a Ramón y Cajal Researcher (RYC-2016-19802).