The XAI Paradox: systems that perform well for the wrong reasons

Cor Steging, Lambert Schomaker, Bart Verheij

Many successful modern machine learning approaches can be described as "black box" systems: they perform well but cannot explain the reasoning behind their decisions. The emerging sub-field of Explainable Artificial Intelligence (XAI) aims to create systems that can explain to their users why they made a particular decision. Using artificial datasets whose internal structure is known beforehand, this study shows that the reasoning of systems that perform well is not necessarily sound. Furthermore, when multiple combined conditions define a dataset, systems can perform well on the combined problem without learning each of the individual conditions. Instead, they often learn a confounding structure within the data that allows them to make the correct decisions. With regard to the goal of creating explainable systems, however, unsound rationales could produce irrational explanations, which would be problematic for the XAI movement.

Two page abstract (in PDF-format)
Manuscript (in PDF-format)

Steging, C., Schomaker, L.R.B., & Verheij, B. (2019). The XAI Paradox: systems that perform well for the wrong reasons (abstract). BNAIC/BENELEARN 2019. Proceedings of the Reference AI & ML Conference for Belgium, Netherlands & Luxemburg. Proceedings of the 31st Benelux Conference on Artificial Intelligence (BNAIC 2019) and the 28th Belgian Dutch Conference on Machine Learning (Benelearn 2019). Brussels, Belgium, November 6-8, 2019 (eds. Beuls, K., Bogaerts, B., Bontempi, G., Geurts, P., Harley, N., Lebichot, B., Lenaerts, T., Louppe, G., & Van Eecke, P.). CEUR Workshop Proceedings.
