AI regulation and the right to meaningful explanation. Pt 2: How (if at all)?
25 November 2024
Earlier,[1] I argued that we should safeguard the right to a meaningful explanation of life-changing decisions, even when such decisions are made by complex AI algorithms. Here, I argue that the complexity of such algorithms does not automatically preclude providing such explanations.
We are used to getting crummy and evasive explanations: ‘because that’s how it’s always been’, ‘because I said so’, ‘computer says no’, and of course the eternally available ‘just because’. Now that the rise of AI decision-making is fueling a demand to entrench a ‘right to a meaningful explanation’, it is worth spending some time on what makes explanations crummy or evasive, rather than meaningful.
Explanations can be bad in a variety of ways. They can focus on details that are of no interest to you: ‘Neuron 164 in the algorithm sent a signal to Neuron 167, which eventually led to the selection of another candidate’. They can leave out details that are of interest to you: ‘we went with a candidate better suited to our current needs’. Or they can simply be false, as is often the case when we say, ‘it’s not you, it’s me’.
A good explanation needs to be truthful and contain the right amount of relevant detail. A key guideline is to find the ideal point of intervention. What kind of change to the input of a decision algorithm would have led to the desired change in the outcome? If the recruitment algorithm would have ranked my application more favorably if my CV demonstrated more work experience in the service industry, then getting more work experience is the relevant point of intervention for having my application ranked better. Thus, my lack of work experience is a good explanation for my poor showing in the rankings delivered by the recruitment algorithm. The explanation provides me with a transparent reason and a clear guide to improving my application.
The ‘ideal’ in ‘ideal point of intervention’ is a frustrating qualifier. Among many other things, the feasibility of an intervention is dependent on personal circumstances. The mortgage would be available if I doubled my down payment? Cold comfort for most, but perhaps a phone call away for some. My application would do better if I appeared more sociable in the job interview? A matter of practice for most, but nigh unfeasible for those with crushing social anxiety or with too much riding on the next interview. Even setting aside the truly frustrating cases where the required interventions are in fact impossible – get rid of your criminal record! – we should be worried about how to legally entrench something as context-dependent as a right to ‘good’ or ‘meaningful’ explanation.
These are legitimate worries, and we have a sizeable portion of work cut out for us. But these worries should be distinguished from a more panicked reaction: that the mere possibility of providing any explanations of AI decisions is just a mirage.
The panic results from a natural line of thinking. Many AI algorithms are unfathomably complex, such that following each step in their decision-making procedures is impossible for cognitively limited creatures like us. How, then, are we to hold out any hope of an explanation of their outcomes?
This line of reasoning is often leveled against current techniques for explaining AI decisions. Such so-called ‘forensic’ techniques do not trace each step in the AI algorithm. Instead, they make predictions of how the algorithm would have decided if it had been fed slightly different information: would candidate A have been hired if her application had shown a strong grasp of Spanish? Would candidate B not have gotten the job if their CV had made no mention of their previous work experience? If the predicted answer to these questions is yes, then the lack of exhibited Spanish skill explains why A did not get hired, and B’s previous work experience explains why they got hired. These strategies align with our ‘ideal point of intervention’ guideline above: by getting more accurate predictions of this kind, we can identify the ideal point of intervention.
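To make the idea of probing with slightly different information concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function `toy_recruitment_model`, its feature names, its weights, and its threshold are hypothetical stand-ins for an opaque algorithm, and real forensic techniques are considerably more sophisticated. The sketch only shows the basic move: change one input at a time, re-run the model, and record which changes flip the outcome.

```python
from typing import Callable, Dict, List, Tuple

Applicant = Dict[str, float]


def toy_recruitment_model(applicant: Applicant) -> bool:
    """Hypothetical stand-in for an opaque model: True means 'hire'.

    The weights and threshold are invented for this example only.
    """
    score = (
        2.0 * applicant["years_experience"]
        + 3.0 * applicant["speaks_spanish"]
        + 1.0 * applicant["interview_score"]
    )
    return score >= 10.0


def single_feature_counterfactuals(
    model: Callable[[Applicant], bool],
    applicant: Applicant,
    candidate_values: Dict[str, List[float]],
) -> List[Tuple[str, float, float]]:
    """Return (feature, old value, new value) for each single change that flips the decision.

    The model is treated as a black box: it is only ever called, never inspected.
    For each feature, the first listed value that flips the outcome is reported.
    """
    original_decision = model(applicant)
    flips = []
    for feature, values in candidate_values.items():
        for value in values:
            if value == applicant[feature]:
                continue
            changed = dict(applicant)
            changed[feature] = value
            if model(changed) != original_decision:
                flips.append((feature, applicant[feature], value))
                break
    return flips


# Candidate A, who was not hired under the toy model.
candidate_a = {"years_experience": 2, "speaks_spanish": 0, "interview_score": 4}

flips = single_feature_counterfactuals(
    toy_recruitment_model,
    candidate_a,
    {
        "years_experience": [3, 4, 5],
        "speaks_spanish": [1],
        "interview_score": [5, 6, 7],
    },
)

for feature, old, new in flips:
    print(f"Changing {feature} from {old} to {new} would flip the decision.")
```

Each reported flip is a candidate point of intervention. Which of them is ‘ideal’ still depends on what is feasible for the person concerned, which the code cannot settle.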
The complaint goes that the use of these techniques provides only ‘post hoc’ explanations: after the decision is made, they make some predictions about how the decision could have gone differently. But, no matter how these techniques work, they do not follow the AI algorithm step by step. That, as said, would be so cumbersome as to be impossible. So these techniques, no matter how they work, are building on a massive simplification of the algorithm, and thus cannot provide a truthful picture of what explains its outcomes.[2]
Certainly, the name ‘post hoc explanation’ raises some red flags. Most of us know the phrase ‘post hoc’ from fallacies such as ‘post hoc ergo propter hoc’[3] and ‘post hoc rationalization’, where more palatable explanations are made up on the fly to cover up the true, less acceptable explanations.[4] But this superficial similarity in naming tells us nothing about why building explanations on simplified models should be a problematic practice.
Often enough, I can truthfully explain that the engine died because I forgot to press the clutch pedal (I’m a terrible driver). The model underlying my explanation simplifies the operative mechanism tremendously by leaving out physical, chemical, and mechanical details that are crucial to the workings of my car, but I provide a genuine explanation nonetheless.
In science, too, our models and laws systematically omit details. The ideal gas law neglects molecular size and intermolecular attraction, but we can use it to explain the behavior of gases nonetheless. Much of our engineering practice still relies on Newtonian physics, which ignores the nuances that subsequent physics research has revealed. And yet, we do not take engineers to be incapable of explaining why a bridge will collapse if it is loaded with more than x tons.
The forensic methods used to deliver so-called post hoc explanations use the simplification methods common in scientific modelling.[5] Then why oppose them? The obvious worry is that, by their simplifications, forensic methods stand a chance of making mistakes. And indeed, sometimes the details omitted are in fact crucial to the outcome and the technique provides a faulty explanation.[6] This does not mean, however, that it fails to provide a truthful explanation in the cases where the omitted details are in fact irrelevant.
The same is true in scientific explanations. If the ideal gas law is applied to a case where the intermolecular attraction is crucial to the behavior of a gas, it will make wrong predictions and provide wrong explanations. It might predict, for example, that a gas explosion was avoidable by reducing the temperature, when in fact it was not. That does not mean that there is something wrong with all the other explanations the law provides in situations where the intermolecular attraction was in fact irrelevant and the law does lead us towards the correct point of intervention. Similarly, even if our forensic methods misfire in some cases by omitting relevant details, they can still provide truthful explanations when they do not omit relevant details. The culprit is not the omission of details per se, but the omission of details that affect which point of intervention is selected.
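The gas example can be sketched numerically. The Python snippet below compares the ideal gas law with the van der Waals equation, which adds corrections for molecular size and for the attraction between molecules. The constants for carbon dioxide are approximate textbook values, the chosen volumes and temperature are arbitrary, and the van der Waals equation is itself only a better simplification, not the truth. The point is merely to show that the simpler model is reliable where the omitted details do not matter and unreliable where they do.

```python
R = 8.314  # universal gas constant, J / (mol K)

# Approximate van der Waals constants for carbon dioxide (assumed textbook values).
CO2_A = 0.364     # attraction term, Pa m^6 / mol^2
CO2_B = 4.267e-5  # molecular-size term, m^3 / mol


def ideal_gas_pressure(n: float, volume: float, temperature: float) -> float:
    """Pressure in Pa from the ideal gas law, P = nRT / V."""
    return n * R * temperature / volume


def van_der_waals_pressure(
    n: float, volume: float, temperature: float, a: float = CO2_A, b: float = CO2_B
) -> float:
    """Pressure in Pa from the van der Waals equation, P = nRT / (V - nb) - a n^2 / V^2."""
    return n * R * temperature / (volume - n * b) - a * n**2 / volume**2


def relative_gap(n: float, volume: float, temperature: float) -> float:
    """Relative difference between the two models, measured against van der Waals."""
    ideal = ideal_gas_pressure(n, volume, temperature)
    corrected = van_der_waals_pressure(n, volume, temperature)
    return abs(ideal - corrected) / corrected


# One mole at 300 K: first spread over 25 litres, then squeezed into 0.2 litres.
dilute_gap = relative_gap(n=1.0, volume=0.025, temperature=300.0)
dense_gap = relative_gap(n=1.0, volume=0.0002, temperature=300.0)

print(f"Dilute gas: models differ by {dilute_gap:.1%}")
print(f"Dense gas:  models differ by {dense_gap:.1%}")
```

Under these assumptions the two models nearly agree for the dilute gas and diverge sharply for the dense one. The simplification is the same in both cases; only in the second does it omit something that matters to the outcome.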
Admittedly, some of the panic lingers. How are we to know when these methods make mistakes? How do these methods decide which details can be omitted? These are reasonable concerns, and an entire branch of AI research is dedicated to addressing them.[7] Rather than panicking, we can remind ourselves that, in science and in everyday explanation, we have made things work under similar circumstances. The challenge of explaining AI outcomes is located on the rough but steady terrain of hard work, rather than in the immeasurable vacuum of impossibility.
[1] Reference to previous blog post.
[2] https://arxiv.org/abs/1811.10154 (accessed July 28th, 2024)
[3] https://en.wikipedia.org/wiki/Post_hoc_ergo_propter_hoc (accessed July 28th, 2024)
[4] https://psycnet.apa.org/doiLanding?doi=10.1037%2F0033-295X.108.4.814 (accessed July 28th, 2024)
[5] https://www.cambridge.org/core/journals/episteme/article/understanding-idealization-and-explainable-ai/635566D3074E7F300EEAC746BA7249E4 (accessed July 28th, 2024)
[6] https://hbaniecki.com/adversarial-explainable-ai/ (accessed July 28th, 2024)
[7] https://en.wikipedia.org/wiki/Explainable_artificial_intelligence