Authors: Lancho, Carmen; Martín de Diego, Isaac; Cuesta, Marina; Aceña, Víctor; M. Moguerza, Javier
Record dates: 2024-09-02; 2024-09-02
Publication year: 2021
Citation: Lancho, C., Martín de Diego, I., Cuesta, M., Aceña, V., M. Moguerza, J. (2021). A Complexity Measure for Binary Classification Problems Based on Lost Points. In: Yin, H., et al. Intelligent Data Engineering and Automated Learning – IDEAL 2021. IDEAL 2021. Lecture Notes in Computer Science, vol 13113. Springer, Cham. https://doi.org/10.1007/978-3-030-91608-4_14
ISBN: 978-3-030-91607-7
URI: https://hdl.handle.net/10115/39274

Abstract: Complexity measures explore and capture the complexity of a data set. In this paper, the Lost Points (LP) complexity measure is proposed. It is obtained by applying k-means in a recursive and hierarchical way, and it provides both a data set perspective and an instance perspective. At the instance level, the LP measure gives each point a probability value that indicates how dominant its class is in its neighborhood. At the data set level, it estimates the proportion of lost points, that is, points that are expected to be misclassified because they lie in areas where their class is not dominant. The proposed measure gives easily interpretable results that are competitive with state-of-the-art measures. In addition, it provides probabilistic information that is useful for highlighting the decision boundary in classification problems.

Language: eng
Keywords: Complexity measures; Neighborhood measures; Binary classification; Supervised machine learning
Title: A complexity measure for binary classification problems based on lost points
Type: info:eu-repo/semantics/bookPart
DOI: 10.1007/978-3-030-91608-4_14
Rights: info:eu-repo/semantics/closedAccess
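
Illustrative note: the two perspectives named in the abstract (a per-instance class-dominance value and a data-set-level proportion of lost points) can be sketched on toy data. The sketch below is not the authors' LP algorithm: it swaps the recursive, hierarchical k-means for a plain k-nearest-neighbour neighborhood, and the function names `class_dominance` and `lost_point_proportion`, the value `k=3`, the 0.5 threshold, and the data are all assumptions made for illustration.

```python
from math import dist


def class_dominance(points, labels, k=3):
    """Fraction of each point's k nearest neighbours that share its label.

    Stand-in for an instance-level dominance value; the paper derives its
    neighbourhoods from recursive k-means, which is NOT done here.
    """
    scores = []
    for i, p in enumerate(points):
        # Rank every other point by Euclidean distance to p.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )
        neighbours = others[:k]
        same = sum(1 for j in neighbours if labels[j] == labels[i])
        scores.append(same / len(neighbours))
    return scores


def lost_point_proportion(scores, threshold=0.5):
    """Share of points whose own class is not dominant nearby (assumed threshold)."""
    return sum(1 for s in scores if s < threshold) / len(scores)


# Toy data (hypothetical): two clusters, plus one class-1 point placed
# inside the class-0 cluster.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (0.05, 0.05)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

scores = class_dominance(points, labels, k=3)
print(scores[-1])                     # dominance of the misplaced point
print(lost_point_proportion(scores))  # data-set-level summary
```

In this toy set only the misplaced class-1 point has a dominance value below the threshold, so it is the single point counted as lost.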