Browsing by Author "Ruiz-Llorente, Sergio"

Showing 1 - 4 of 4

A Resampling Univariate Analysis Approach to Ovarian Cancer from Clinical and Genetic Data
(IEEE, 2021-02-08) Bote-Curiel, Luis; Ruiz-Llorente, Sergio; Muñoz-Romero, Sergio; Yagüe-Fernández, Mónica; Barquín, Arantzazu; García-Donas, Jesús; Rojo-Álvarez, José Luis
Ovarian cancer (OC) is the second most common gynecological malignancy and the gynecological tumor with the worst prognosis. To help improve this situation, Data Science technologies could be a useful tool for helping clinicians learn more about the disease. In our case, we are interested in exploring OC data to discover relationships between clinical and genetic factors and disease progression. To this end, we propose an analysis framework, based on bootstrap resampling, for simple univariate statistical descriptions of features of different types. First, we define the framework for metric, categorical, and date variables and determine the advantages and disadvantages of different bootstrap resampling strategies according to their statistical basis. Then, we use it to perform a univariate analysis of an OC dataset, exploring how disease progression, with platinum-free interval as the indicator, relates to clinical and genetic features of different types. The analysis also provides a first set of variables possibly relevant for survival prediction. The results show that some features differ individually between the platinum-resistant (< 6 months) and platinum-sensitive (> 6 months) groups. It can be concluded that this could be an indicator that the database is discriminatory for the hypotheses studied, though multivariate analyses are advisable to check how relationships among features are influenced.
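As a rough illustration of the kind of bootstrap-based univariate comparison the abstract describes, the sketch below estimates a percentile confidence interval for the difference in means of one metric feature between two groups. It is not the authors' framework: the function name `bootstrap_mean_diff`, its parameters, and the toy values are assumptions made for this example only, and the abstract does not state which bootstrap strategy or statistic was used.

```python
import random
from statistics import mean


def bootstrap_mean_diff(group_a, group_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for mean(group_a) - mean(group_b).

    Hypothetical helper: returns (lower, upper, flagged), where flagged is
    True when the interval excludes zero.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size.
        res_a = rng.choices(group_a, k=len(group_a))
        res_b = rng.choices(group_b, k=len(group_b))
        diffs.append(mean(res_a) - mean(res_b))
    diffs.sort()
    # Empirical quantiles of the sorted bootstrap statistics.
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper, not (lower <= 0 <= upper)


# Invented toy values standing in for one metric feature in each group.
resistant = list(range(60, 70))
sensitive = list(range(40, 50))
lower, upper, flagged = bootstrap_mean_diff(resistant, sensitive)
print(f"interval: [{lower:.2f}, {upper:.2f}], excludes zero: {flagged}")
```

A categorical or date feature would need a different statistic (for example, a difference in proportions), but the resampling loop would keep the same shape.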
Multivariate feature selection and autoencoder embeddings of ovarian cancer clinical and genetic data
(Elsevier, 2022) Bote-Curiel, Luis; Ruiz-Llorente, Sergio; Muñoz-Romero, Sergio; Yagüe-Fernández, Mónica; Barquín, Arantzazu; García-Donas, Jesús; Rojo-Álvarez, José Luis
Although certain genetic alterations have been defined as predictive and prognostic biomarkers in the context of ovarian cancer (OC), data science methods represent alternative approaches to identify novel correlations and define relevant markers in these gynecological tumors. Considering this potential, our work focused on both clinical and genomic data collected from patients with OC to identify relationships between clinical and genetic factors and disease progression-related variables. For this aim, we proposed two analyses: (1) a nonlinear exploration of an OC dataset using autoencoders, a type of neural network that can be used as a feature extraction tool to represent a dataset in a 3-dimensional latent space, so that we could assess whether there are intrinsic or natural nonlinear separability patterns between disease progression groups (in our case, platinum-sensitive and platinum-resistant groups); and (2) the identification of relevant variable relationships by means of an adaptation of the informative variable identifier (IVI), a feature selection method that labels each input feature as informative or noisy with respect to the task at hand, identifies the relationships among features, and builds a ranking of features, allowing us to study which input features and relationships may be most informative for the OC disease progression classification to define new biomarkers involved in disease progression. Our interest has been in clinical and genetic factors and in the combination of clinical features and genetic profile. Results with autoencoders suggest a pattern of separability between disease progression groups in the clinical part and for the combination of genes and clinical features of the OC dataset, which is increased via supervised fine-tuning. In the genetic part, this pattern of separability is not observed, but it is more defined when supervised fine-tuning is performed.
Results of the IVI-mediated feature selection method show significance for relevant clinical variables (such as type of surgery and neoadjuvant chemotherapy), some mutation genes, and low-risk genetic features. These results highlight the efficacy of the considered approaches to better understand the clinical course of OC.
Multivariate Feature Selection and Autoencoder Embeddings of Ovarian Cancer Clinical and Genetic Data
(Elsevier Ltd, 2022) Bote-Curiel, Luis; Ruiz-Llorente, Sergio; Muñoz-Romero, Sergio; Yagüe-Fernández, Mónica; Barquín, Arantzazu; García-Donas, Jesús; Rojo-Álvarez, José Luis
Although certain genetic alterations have been defined as predictive and prognostic biomarkers in the context of ovarian cancer (OC), data science methods represent alternative approaches to identify novel correlations and define relevant markers in these gynecological tumors. Considering this potential, our work focused on both clinical and genomic data collected from patients with OC to identify relationships between clinical and genetic factors and disease progression-related variables. For this aim, we proposed two analyses: (1) a nonlinear exploration of an OC dataset using autoencoders, a type of neural network that can be used as a feature extraction tool to represent a dataset in a 3-dimensional latent space, so that we could assess whether there are intrinsic or natural nonlinear separability patterns between disease progression groups (in our case, platinum-sensitive and platinum-resistant groups); and (2) the identification of relevant variable relationships by means of an adaptation of the informative variable identifier (IVI), a feature selection method that labels each input feature as informative or noisy with respect to the task at hand, identifies the relationships among features, and builds a ranking of features, allowing us to study which input features and relationships may be most informative for the OC disease progression classification to define new biomarkers involved in disease progression. Our interest has been in clinical and genetic factors and in the combination of clinical features and genetic profile. Results with autoencoders suggest a pattern of separability between disease progression groups in the clinical part and for the combination of genes and clinical features of the OC dataset, which is increased via supervised fine-tuning. In the genetic part, this pattern of separability is not observed, but it is more defined when supervised fine-tuning is performed.
Results of the IVI-mediated feature selection method show significance for relevant clinical variables (such as type of surgery and neoadjuvant chemotherapy), some mutation genes, and low-risk genetic features. These results highlight the efficacy of the considered approaches to better understand the clinical course of OC.
Text Analytics and Mixed Feature Extraction in Ovarian Cancer Clinical and Genetic Data
(IEEE, 2021) Bote-Curiel, Luis; Ruiz-Llorente, Sergio; Muñoz-Romero, Sergio; Yagüe-Fernández, Mónica; Barquín, Arantzazu; García-Donas, Jesús; Rojo-Álvarez, José Luis
Richer integrative analysis methods for oncological studies are needed to efficiently leverage the amount of clinical and genetic data available and provide clinicians with better information. However, analyses of this nature often require mixing data of different types, which are not straightforward to address jointly with classical methods. In this work, our aim is to find relationships between clinical and genetic features of different types (metric, categorical, and text) and ovarian cancer (OC) disease progression. To this end, we first propose a univariate statistical method for text-type features that applies bootstrap resampling to Bag of Words and Latent Dirichlet Allocation in order to include the free-text fields of the health records as features. Secondly, we extend bootstrap resampling for metric and categorical feature extraction with Principal Component Analysis (PCA) and Multiple Correspondence Analysis (MCA), respectively. We subsequently formulate a novel and integrative method for jointly considering metric, categorical, and text features. Results obtained in text analysis indicate individual differences in some words between two OC patient groups categorised according to their sensitivity to platinum drugs. These results indicate separability between both groups for text features. Also, regarding the multivariate analysis, clinical data results showed separability patterns for the three methods analysed according to the platinum-sensitivity degree. The use of these analytical tools in our OC cohort has allowed us to demonstrate their strengths by confirming the predictive and prognostic role of widely known clinical and genetic variables (BRCA status, value of adjuvant therapy and optimal resection, or family history) and demonstrating significant associations in other variables whose role in OC development has been studied to a lesser extent (such as PMS1, GPC3, and SLX4 genes).
These results highlight the value of implementing these approaches for the identification of novel biomarkers in the context of OC.
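The text-feature step mentioned in this abstract (bootstrap resampling applied to a Bag of Words representation) could look roughly like the sketch below, which bootstraps the difference between two groups in the proportion of notes containing a given word. This is an assumption-laden illustration rather than the published method: `bag_of_words`, `word_presence_diff_ci`, the whitespace tokenizer, and the made-up note strings are all hypothetical, and Latent Dirichlet Allocation is not covered.

```python
import random
from collections import Counter


def bag_of_words(text):
    """Very simple Bag of Words: lowercase, whitespace-split token counts."""
    return Counter(text.lower().split())


def word_presence_diff_ci(docs_a, docs_b, word, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the difference in the proportion of
    documents containing `word` (group A minus group B).

    Hypothetical helper: returns (lower, upper, flagged), where flagged is
    True when the interval excludes zero.
    """
    rng = random.Random(seed)
    # One 0/1 indicator per document: does the word appear at least once?
    has_a = [1 if bag_of_words(d)[word] > 0 else 0 for d in docs_a]
    has_b = [1 if bag_of_words(d)[word] > 0 else 0 for d in docs_b]
    diffs = []
    for _ in range(n_resamples):
        # Resample documents (via their indicators) with replacement.
        res_a = rng.choices(has_a, k=len(has_a))
        res_b = rng.choices(has_b, k=len(has_b))
        diffs.append(sum(res_a) / len(res_a) - sum(res_b) / len(res_b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper, not (lower <= 0 <= upper)


# Invented placeholder notes, not real clinical text.
group_a_notes = ["alpha beta gamma", "alpha delta", "alpha beta"]
group_b_notes = ["beta gamma", "delta epsilon", "gamma beta"]
print(word_presence_diff_ci(group_a_notes, group_b_notes, "alpha"))
```

Running the same function over every word in the vocabulary would give a per-word screening, though in practice some correction for multiple comparisons would likely be needed.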