Explanation Sets: A framework for Machine Learning explicability
Date
2023
Publisher
Universidad Rey Juan Carlos
Abstract
The term Machine Learning (ML) was coined by Arthur Samuel in 1959. Since then,
more than sixty years have passed, and ML has evolved enormously, especially in the last
decade. From the early days of ML, when it was primarily a research topic, to today, when
we interact with ML systems on a daily basis, often without even realizing it, we have
come a long way. Although the explainability of these ML systems has been considered
since their inception, it has become more important than ever due to their integration into
our daily lives. Explainable ML addresses this issue, aiming to make predictive models
and their decisions understandable to humans.
There are several Explainable ML techniques, each with its own goals and scopes.
For example, the scope of a technique can be either global, addressing the entire model,
or local, focusing on a specific region of interest. While the choice of technique
depends on several factors, the main driving factor is the user, specifically their cognitive
biases and what they expect from the system. These preferences and the different types of
explanations have been extensively studied in the social sciences. Among these explanation
types, we emphasize counterfactuals and semifactuals, which have also been incorporated
into Explainable ML. Both are contrastive explanations, in which the user reasons about
the differences between the observation of interest and a hypothetical observation that
leads to the same prediction (semifactual) or to a different one (counterfactual). Within
the context of ML, however, they face some limitations. Both are mainly defined in a
classification context and lack a standardized mechanism to enforce user preferences.
Counterfactuals typically rely on a single observation, whereas semifactuals lack a general
definition and appear in the literature under different terms.
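To make the contrast concrete, the sketch below (a minimal illustration; the toy model, the candidate pool, and the Euclidean distance are our assumptions, not the thesis's procedure) recovers both kinds of explanation for a binary classifier by scanning a pool of candidate observations:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Toy setup: a binary classifier and a pool of candidate observations.
X, y = make_classification(n_samples=500, n_features=4, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

x = X[0]                      # factual observation to explain
pred = model.predict([x])[0]  # its prediction
candidates = X[1:]
cand_preds = model.predict(candidates)
dists = np.linalg.norm(candidates - x, axis=1)

# Counterfactual: the most similar candidate with a DIFFERENT prediction
# ("had the input been like this, the prediction would have changed").
diff = cand_preds != pred
counterfactual = candidates[diff][np.argmin(dists[diff])]

# Semifactual: the least similar candidate with the SAME prediction
# ("even with a change this large, the prediction would not change").
same = ~diff
semifactual = candidates[same][np.argmax(dists[same])]
```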
This thesis introduces the Explanation Sets framework to address these limitations.
Explanation Sets unify counterfactuals and semifactuals through similarity measures and
give users a mechanism to specify their preferences via a feasible set. Beyond providing
a unified framework, the similarity-based definitions enable the seamless extension of
counterfactuals and semifactuals to other tasks, such as regression, by choosing
appropriate similarities. We also review how various techniques from the literature fit
this framework. The proposed approach was successfully validated on regression and
classification tasks, showing how different feasible sets and similarity measures produce
different explanations.
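As a minimal sketch of this idea (an illustrative formalization under our own naming, not necessarily the thesis's exact definitions), an Explanation Set can be read as the members of a user-specified feasible set that maximize a similarity measure with respect to the factual sample:

```python
import numpy as np

def explanation_set(x, feasible, similarity, k=5):
    """Return the k members of the user-defined feasible set that are
    most similar to the factual sample x. The similarity measure and
    the feasible set jointly determine the kind of explanation."""
    scores = np.array([similarity(x, e) for e in feasible])
    return feasible[np.argsort(scores)[-k:][::-1]]

# Hypothetical counterfactual-style similarity for a REGRESSION task:
# reward closeness in feature space and a large change in the
# predicted outcome (alpha trades off the two terms).
def make_cf_similarity(model, alpha=1.0):
    def sim(x, e):
        preds = model.predict(np.vstack([x, e]))
        return -np.linalg.norm(x - e) + alpha * abs(preds[0] - preds[1])
    return sim
```

Reversing the roles of the two terms, or restricting the feasible set to candidates with the same prediction, yields semifactual-style behavior; actionability constraints can likewise be encoded directly in the feasible set.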
We also introduce two methods to extract Explanation Sets: Anchor_ES and Random
Forest Optimal Counterfactual Set Extractor (RF-OCSE). Anchor_ES expands upon the
Anchor method, allowing for user-defined similarity measures and including a feasible set.
On the other hand, RF-OCSE extracts counterfactual Explanation Sets from a Random
Forest (RF). It partially fuses the Decision Trees (DTs) of the forest into a single DT
using a modification of the Classification and Regression Trees (CART) algorithm.
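The fusion step itself is beyond the scope of a short example, but the sketch below (a single tree only, with our own function names) illustrates what access to the internal structure buys: the leaf regions of a fitted scikit-learn DT that predict a desired class can be enumerated exactly as per-feature interval boxes, from which counterfactual regions can be read off:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def class_regions(tree: DecisionTreeClassifier, target_class):
    """Enumerate the leaf regions of a fitted tree that predict
    target_class, as (lower_bounds, upper_bounds) interval boxes.
    RF-OCSE fuses several trees of a forest first; that step is
    not reproduced in this simplified sketch."""
    t = tree.tree_
    regions = []

    def walk(node, lo, hi):
        if t.children_left[node] == -1:  # leaf node
            if np.argmax(t.value[node]) == target_class:
                regions.append((lo.copy(), hi.copy()))
            return
        f, thr = t.feature[node], t.threshold[node]
        old = hi[f]
        hi[f] = min(hi[f], thr)             # left branch: feature <= thr
        walk(t.children_left[node], lo, hi)
        hi[f] = old
        old = lo[f]
        lo[f] = max(lo[f], thr)             # right branch: feature > thr
        walk(t.children_right[node], lo, hi)
        lo[f] = old

    n = t.n_features
    walk(0, np.full(n, -np.inf), np.full(n, np.inf))
    return regions
```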
Both extraction methods were validated through experiments against existing alternatives
on several well-known datasets. The evaluation metrics measure aspects correlated with
the quality of the explanations, including the percentage of valid counterfactuals, the
distance to the factual sample, method stability, and counterfactual set quality. RF-OCSE
was the only method supporting set explanations that always yielded valid explanations,
and it took, on average, significantly less time than the alternatives. Anchor_ES, in turn,
achieved a good compromise between fidelity and coverage, making it a viable alternative,
especially when full access to the model is not possible.
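As a concrete reading of the first two metrics (a simplified sketch under our own naming; the thesis's exact formulations may differ), validity checks that a candidate counterfactual actually changes the model's prediction, and the distance measures how far it lies from the factual sample:

```python
import numpy as np

def validity_rate(model, factuals, counterfactuals):
    """Fraction of counterfactuals whose prediction differs from that
    of the corresponding factual sample (simplified sketch)."""
    return np.mean(model.predict(factuals) != model.predict(counterfactuals))

def mean_distance(factuals, counterfactuals):
    """Average L2 distance between each factual sample and its
    counterfactual; lower values mean less drastic changes."""
    return np.mean(np.linalg.norm(factuals - counterfactuals, axis=1))
```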
In conclusion, we introduce a novel explainability framework that empowers users to
tailor explanations to their preferences. Explanation Sets pave the way for expressing
preferences not yet recognized in the literature in a unified and standardized manner,
which simplifies their eventual incorporation into extraction methods. Regarding the
extraction methods, we observed a significant disparity in quality between methods that
exploit the internal structure of the model and those that treat it as a black box,
highlighting the benefits of the former approach whenever such access is possible.
Description
Doctoral thesis defended at the Universidad Rey Juan Carlos de Madrid in 2023. Supervisors:
Isaac Martín de Diego,
Javier Martínez Moguerza
Except where otherwise noted, the item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International