A Framework for Supervised Classification Performance Analysis with Information-Theoretic Methods

Date

2019-05-08

Publisher

IEEE

Abstract

We introduce a framework for the evaluation of multiclass classifiers by exploring their confusion matrices. Instead of using error-counting measures of performance, we concentrate on quantifying the information transfer from true to estimated labels using information-theoretic measures. First, the Entropy Triangle allows us to visualize the balance of mutual information, variation of information, and the deviation from uniformity in the true and estimated label distributions. Next, the Entropy-Modified Accuracy allows us to rank classifiers by performance, while the Normalized Information Transfer rate allows us to evaluate classifiers by the amount of information accrued during learning. Finally, if the question arises of which errors the classifier commits systematically, we use a generalization of Formal Concept Analysis to elicit such knowledge. All such techniques can be applied to either artificially or biologically embodied classifiers (e.g., human performance on perceptual tasks). We instantiate the framework in a number of examples to provide guidelines for using these tools to assess single classifiers or populations of them, whether induced with the same technique or not, on single tasks or on sets of tasks. These examples include well-known UCI tasks and the more complex KDD Cup 99 competition on Intrusion Detection.
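As a rough illustration of the three quantities the Entropy Triangle balances, the following minimal sketch computes them from a confusion matrix using standard Shannon-entropy identities. It is not the authors' implementation: the function names `entropy` and `entropy_balance`, the convention that rows hold true labels and columns hold estimated labels, and the example counts are all assumptions made for illustration. The Entropy-Modified Accuracy and the Normalized Information Transfer rate are not computed here, since the abstract does not define them.

```python
import math


def entropy(p):
    """Shannon entropy, in bits, of a probability vector (zeros are skipped)."""
    return -sum(q * math.log2(q) for q in p if q > 0)


def entropy_balance(confusion):
    """Split the maximal (uniform) entropy of a confusion matrix into three parts.

    Assumption: rows are true labels, columns are estimated labels.
    Returns a dict whose entries satisfy
        h_uniform == delta_h + 2 * mi + vi   (up to rounding).
    """
    total = sum(sum(row) for row in confusion)
    joint = [[c / total for c in row] for row in confusion]
    p_true = [sum(row) for row in joint]          # marginal of true labels
    p_est = [sum(col) for col in zip(*joint)]     # marginal of estimated labels

    h_true = entropy(p_true)
    h_est = entropy(p_est)
    h_joint = entropy([q for row in joint for q in row])
    h_uniform = math.log2(len(p_true)) + math.log2(len(p_est))

    mi = h_true + h_est - h_joint         # mutual information
    vi = 2 * h_joint - h_true - h_est     # variation of information
    delta_h = h_uniform - h_true - h_est  # deviation of marginals from uniformity
    return {"delta_h": delta_h, "mi": mi, "vi": vi, "h_uniform": h_uniform}


# Hypothetical two-class confusion matrix (counts), for illustration only.
example = [[40, 10],
           [5, 45]]
balance = entropy_balance(example)
```

Dividing `delta_h`, `2 * mi`, and `vi` by `h_uniform` gives three non-negative fractions that sum to one, which is the kind of coordinate a ternary plot needs; whether this matches the paper's exact normalization should be checked against the article itself.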

Description

This work was partly supported by the Spanish Ministry of Economy & Competitiveness projects TEC2014-53390-P and TEC2017-84395-P.

Citation

F. J. Valverde Albacete and C. Peláez-Moreno. A framework for supervised classification performance analysis with information-theoretic methods. IEEE Transactions on Knowledge and Data Engineering, 32(11):2075–2087, November 2020. doi: 10.1109/TKDE.2019.2915643.
Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International