Browsing by Author "Martín De Diego, Isaac"
Showing 1 - 2 of 2
Item: General Performance Score for classification problems (Springer, 2021)
Martín De Diego, Isaac; Redondo, Ana R.; Fernández, Rubén R.; Navarro, Jorge; Moguerza, Javier M.

Several performance metrics are currently available to evaluate the performance of Machine Learning (ML) models in classification problems. ML models are usually assessed using a single measure because it facilitates the comparison between several models. However, there is no silver bullet, since each performance metric emphasizes a different aspect of the classification. Thus, the choice depends on the particular requirements and characteristics of the problem. An additional problem arises in multi-class classification, since most of the well-known metrics are only directly applicable to binary classification problems. In this paper, we propose the General Performance Score (GPS), a methodological approach to build performance metrics for binary and multi-class classification problems. The basic idea behind GPS is to combine a set of individual metrics, penalising low values in any of them. Thus, users can combine the performance metrics that are relevant to their particular problem, according to their preferences, obtaining a conservative combination. Different GPS-based performance metrics are compared with alternatives in classification problems using real and simulated datasets. The metrics built with the proposed method improve the stability and explainability of the usual performance metrics. Finally, GPS brings benefits to both new research lines and practical usage, where performance metrics tailored to each particular problem are considered.

Item: Hostility measure for multi-level study of data complexity (Springer, 2022)
Lancho, Carmen; Martín De Diego, Isaac; Cuesta, Marina; Aceña, Víctor; Moguerza, Javier M.

Complexity measures aim to characterize the underlying complexity of supervised data. These measures tackle factors that hinder the performance of Machine Learning (ML) classifiers, such as overlap, density, and linearity. The state of the art has mainly focused on the dataset perspective of complexity, i.e., offering an estimation of the complexity of the whole dataset. Recently, the instance perspective has also been addressed. In this paper, the hostility measure, a complexity measure offering a multi-level (instance, class, and dataset) perspective of data complexity, is proposed. The proposal is built by estimating the novel notion of hostility: the difficulty of correctly classifying a point, a class, or a whole dataset given their corresponding neighborhoods. The proposed measure is estimated at the instance level by applying the k-means algorithm in a recursive and hierarchical way, which makes it possible to analyze how points from different classes are naturally grouped together across partitions. The instance information is then aggregated to provide complexity knowledge at the class and dataset levels. The validity of the proposal is evaluated through a variety of experiments covering the three perspectives, together with the corresponding comparison with state-of-the-art measures. Throughout the experiments, the hostility measure has shown promising results, proving to be competitive, stable, and robust.
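To make the GPS idea from the first item above concrete, here is a minimal Python sketch. It assumes the harmonic mean as the conservative combination that penalises low values; the paper's exact formulation may differ, and the choice of metrics to combine (macro-averaged precision and recall) is only an example.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

def gps(metric_values):
    """Conservative combination of metric values in [0, 1].

    Assumption: harmonic mean as the combiner. It penalises low
    values, so if any individual metric is near 0, the combined
    score is dragged toward 0.
    """
    values = np.asarray(metric_values, dtype=float)
    if np.any(values == 0):
        return 0.0
    return len(values) / np.sum(1.0 / values)

# Toy multi-class usage: combine macro precision and macro recall.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
p = precision_score(y_true, y_pred, average="macro", zero_division=0)
r = recall_score(y_true, y_pred, average="macro", zero_division=0)
print(gps([p, r]))  # F1-like conservative score
```

With the harmonic mean as the combiner, a model cannot compensate a very poor metric with a very good one, which is the conservative behaviour the abstract describes.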
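For the second item, the sketch below illustrates the multi-level (instance, class, dataset) aggregation behind the hostility measure. It is simplified to a single k-means partition instead of the paper's recursive, hierarchical scheme; the function name and the choice of k are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification

def hostility(X, y, k=10, random_state=0):
    """Single-partition simplification: a point's hostility is the
    fraction of points in its k-means cluster that belong to a
    different class. The paper applies k-means recursively across
    several layers; one partition is used here for illustration.
    """
    labels = KMeans(n_clusters=k, n_init=10,
                    random_state=random_state).fit_predict(X)
    instance = np.empty(len(y), dtype=float)
    for c in range(k):
        mask = labels == c
        yc = y[mask]
        # Per point: share of cluster mates from another class.
        instance[mask] = np.array([(yc != cls).mean() for cls in yc])
    # Aggregate instance values to class- and dataset-level scores.
    per_class = {cls: instance[y == cls].mean() for cls in np.unique(y)}
    return instance, per_class, instance.mean()

# Toy usage on simulated data.
X, y = make_classification(n_samples=300, n_informative=4, random_state=0)
inst, per_class, overall = hostility(X, y)
print(per_class, overall)
```

Points sitting in clusters dominated by another class get hostility near 1, and averaging over a class or over the whole dataset yields the class- and dataset-level views the abstract refers to.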