Hostility measure for multi-level study of data complexity

dc.contributor.author: Lancho, Carmen
dc.contributor.author: Martín De Diego, Isaac
dc.contributor.author: Cuesta, Marina
dc.contributor.author: Aceña, Víctor
dc.contributor.author: Moguerza, Javier M.
dc.date.accessioned: 2023-09-19T14:54:43Z
dc.date.available: 2023-09-19T14:54:43Z
dc.date.issued: 2022
dc.description: This research has been supported by grants from Rey Juan Carlos University (Ref: C1PREDOC2020), Madrid Autonomous Community (Ref: IND2019/TIC-17194) and the Spanish Ministry of Science and Innovation, under the Retos-Investigación program: MODAS-IN (Ref: RTI-2018-094269-B-I00). We would like to thank Antonio Alonso Ayuso and Emilio López Cano from the Data Science Laboratory at Rey Juan Carlos University for checking the mathematics. Funding: Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.
dc.description.abstract: Complexity measures aim to characterize the underlying complexity of supervised data. These measures address factors that hinder the performance of Machine Learning (ML) classifiers, such as overlap, density, and linearity. The state of the art has mainly focused on the dataset perspective of complexity, i.e., offering an estimate of the complexity of the whole dataset. Recently, the instance perspective has also been addressed. This paper proposes the hostility measure, a complexity measure offering a multi-level (instance, class, and dataset) perspective of data complexity. The proposal is built by estimating the novel notion of hostility: the difficulty of correctly classifying a point, a class, or a whole dataset given their corresponding neighborhoods. The proposed measure is estimated at the instance level by applying the k-means algorithm in a recursive and hierarchical way, which makes it possible to analyze how points from different classes are naturally grouped together across partitions. The instance information is then aggregated to provide complexity knowledge at the class and dataset levels. The validity of the proposal is evaluated through a variety of experiments covering the three perspectives, together with a comparison against state-of-the-art measures. Throughout the experiments, the hostility measure has shown promising results and has proved to be competitive, stable, and robust.
dc.identifier.citation: Lancho, C., Martín De Diego, I., Cuesta, M. et al. Hostility measure for multi-level study of data complexity. Appl Intell 53, 8073–8096 (2023). https://doi.org/10.1007/s10489-022-03793-w
dc.identifier.doi: 10.1007/s10489-022-03793-w
dc.identifier.issn: 1573-7497
dc.identifier.uri: https://hdl.handle.net/10115/24383
dc.language.iso: eng
dc.publisher: Springer
dc.rights: Attribution 4.0 International
dc.rights.accessRights: info:eu-repo/semantics/openAccess
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject: Hostility measure
dc.subject: Complexity measures
dc.subject: Data complexity
dc.subject: Classification
dc.subject: Supervised problems
dc.title: Hostility measure for multi-level study of data complexity
dc.type: info:eu-repo/semantics/article
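
The abstract describes estimating hostility at the instance level through recursive, hierarchical k-means and then aggregating to the class and dataset levels. The following is a minimal sketch of that general idea only, not the paper's algorithm or its exact definition. Everything named here is an assumption made for illustration: the function names `hostility_sketch` and `_kmeans`, the `sigma` parameter that shrinks the number of clusters per layer, the farthest-point initialization (chosen so the run is deterministic), and the score itself, taken as the average share of other-class points in a point's cluster across layers.

```python
from collections import Counter, defaultdict


def _sqdist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def _kmeans(points, k, iters=20):
    """Plain Lloyd k-means with deterministic farthest-point initialization."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(_sqdist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: _sqdist(p, centroids[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def hostility_sketch(X, y, sigma=2):
    """Illustrative multi-level score (hypothetical, not the published measure).

    Layer 1 clusters the points; each later layer clusters the centroids of
    the previous one, so the partitions become coarser. At every layer a
    point is scored by the share of other-class points in its cluster.
    Returns (instance_scores, class_scores, dataset_score).
    """
    n = len(X)
    assign = list(range(n))          # point -> index of its representative
    reps = [list(p) for p in X]      # representatives clustered at this layer
    per_layer = []
    k = len(reps) // sigma
    while k >= 2:
        labels, reps = _kmeans(reps, k)
        assign = [labels[a] for a in assign]
        counts = defaultdict(Counter)
        for a, cls in zip(assign, y):
            counts[a][cls] += 1
        per_layer.append([1 - counts[a][cls] / sum(counts[a].values())
                          for a, cls in zip(assign, y)])
        k = len(reps) // sigma
    if not per_layer:
        raise ValueError("need at least 2 * sigma points")
    # Instance level: average over layers.
    instance = [sum(col) / len(col) for col in zip(*per_layer)]
    # Class level: average over the instances of each class.
    by_class = defaultdict(list)
    for score, cls in zip(instance, y):
        by_class[cls].append(score)
    classes = {cls: sum(v) / len(v) for cls, v in by_class.items()}
    # Dataset level: average over all instances.
    dataset = sum(instance) / n
    return instance, classes, dataset


# Two well-separated classes: every cluster stays pure at every layer.
X_sep = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
y_sep = [0, 0, 0, 0, 1, 1, 1, 1]
inst_sep, cls_sep, data_sep = hostility_sketch(X_sep, y_sep)
print(inst_sep, cls_sep, data_sep)
```

In this sketch a score near 0 means a point sits among points of its own class at every scale, while a score near 0.5 in a balanced two-class problem means its neighborhoods are fully mixed. The published measure should be taken from the article itself.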

Files

Original bundle

Name: s10489-022-03793-w.pdf
Size: 3.97 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.67 KB
Format: Item-specific license agreed upon to submission