Making AI more ‘explainable’ in health-care settings may lead to more mistakes: U of T researcher

Machine learning algorithms have the potential to provide huge benefits in health care, potentially providing more reliable diagnoses than human doctors in some cases. Yet, many of us are reluctant to entrust our health to an algorithm – especially when even its designers can’t explain exactly how it arrived at its conclusion.

This has led to a push for more transparency in AI design so that these powerful tools can explain how they came to their decisions, an idea referred to as “explainable AI.”



But the University of Toronto’s Marzyeh Ghassemi, an assistant professor in the Temerty Faculty of Medicine and in the department of computer science in the Faculty of Arts & Science, says that explainable AI may actually make matters worse, pointing to her own research that suggests explainable AI is perceived to be more trustworthy despite being less accurate. 

“Humans tend to over-trust AI systems in two specific scenarios,” says Ghassemi, who specializes in machine learning for health and is a faculty affiliate at U of T’s Schwartz Reisman Institute for Technology and Society.

“Number one is when they believe the machine can perform a function they cannot, and number two is when there’s an expectation that the machine might mitigate some risks. Both of these are very true in many health settings, but particularly in emergency and intensive health settings.”

Thus, providing explanations of algorithmic medical recommendations could result in doctors over-trusting, and hence over-relying on, these recommendations even when they are mistaken, Ghassemi says.

Her research, which she presented at the Schwartz Reisman weekly seminar on Sept. 30, looks at how both experts and non-experts make use of medical advice they believe to be machine-generated. Experts tend to rate advice they are told comes from an AI as less reliable than advice from a human, whereas non-experts tend to trust both sources equally.

However, experts’ lack of trust turns out to make little difference to how much they are influenced by this advice, or to the accuracy of their resulting recommendations or actions. So the experts – the same ones who were suspicious of the AI advice – were just as likely to base their final decision on this advice as novices who rated it as more trustworthy.

Ghassemi’s work relates to a number of intersections between the four conversations that guide the work of the Schwartz Reisman Institute. Her work asks how AI systems can produce the maximum benefit for humanity, but also how these systems can be designed fairly so that disadvantaged communities do not bear an unfair degree of risk when the systems fail, and how they might be used to combat the bias that already exists in the medical system.

Her work, and the work of other Schwartz Reisman researchers, goes beyond the direct consequences of new technology to explore how it is integrated into existing communities, and how it might transform these communities for better or worse.

In the context of Ghassemi’s work, if AI advice is unreliable, even experienced clinicians can be misled into providing the wrong diagnosis. This becomes especially problematic when AI tools are trained on data from certain conditions, and then those conditions change.

“We don’t know what happens when, for example, a new disease comes up or a pandemic occurs,” says Ghassemi, “and suddenly [the AI is] giving the wrong predictions because we haven’t updated our models fast enough.”

Transparency might seem to be a solution to this problem, giving us the ability to know when to trust an AI and when to doubt it. But in fact, Ghassemi suggests the opposite is true. She highlights a study called “Manipulating and Measuring Model Interpretability” that showed people were actually less likely to catch obvious errors made by a transparent model (one that gave a clear weighting formula to justify its decision) than they were to catch the same errors made by a “black-box” model (one that gave no explanation for how its decision was reached).

Ghassemi argues that we do need a certain kind of transparency, but not the kind traditionally championed by proponents of explainable AI. Instead, doctors need to know when the model is likely to be wrong – perhaps because of the population the AI system was trained on, or due to limitations in the data the system was trained on.

Another argument in favour of transparency is that it can reduce bias in decision-making. Machine learning can end up encoding biases against vulnerable populations. However, bias is clearly already a part of the medical landscape even without the use of advanced technologies. As Ghassemi puts it: “Doctors are human, and humans are biased.” And machine learning models can inherit this bias from the data they are trained on. Knowing how a model works will not automatically catch this kind of bias; that requires knowledge of both how machine learning models work and the biases present in the clinical data.

Ghassemi argues that we need aggressive audits of machine learning models to identify potential bias, and that this is where transparency can serve a valuable function.
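
To make the idea of an audit concrete, here is a minimal sketch in Python of one simple check: comparing how often a model is wrong for different patient groups. It is an illustration only, not a method described by Ghassemi. The function names, the record format (group, true label, predicted label) and the 0.1 gap threshold are all assumptions made for this example.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """Summarize model errors per group.

    `records` is an iterable of (group, y_true, y_pred) tuples with
    binary labels (1 = condition present). This format is hypothetical.
    """
    counts = defaultdict(lambda: {"n": 0, "errors": 0, "positives": 0, "missed": 0})
    for group, y_true, y_pred in records:
        c = counts[group]
        c["n"] += 1
        c["errors"] += int(y_true != y_pred)
        if y_true == 1:
            c["positives"] += 1
            c["missed"] += int(y_pred == 0)  # a missed diagnosis

    report = {}
    for group, c in counts.items():
        report[group] = {
            "n": c["n"],
            "error_rate": c["errors"] / c["n"],
            # Undefined when a group has no positive cases.
            "false_negative_rate": (
                c["missed"] / c["positives"] if c["positives"] else None
            ),
        }
    return report


def flag_gaps(report, max_gap=0.1):
    """Return groups whose false-negative rate exceeds the best group's
    rate by more than `max_gap` (an arbitrary illustrative threshold)."""
    rates = {
        g: r["false_negative_rate"]
        for g, r in report.items()
        if r["false_negative_rate"] is not None
    }
    if not rates:
        return []
    best = min(rates.values())
    return sorted(g for g, rate in rates.items() if rate - best > max_gap)


if __name__ == "__main__":
    # Toy data invented for this example; not from any study.
    toy = [
        ("A", 1, 1), ("A", 1, 1), ("A", 0, 0), ("A", 1, 0),
        ("B", 1, 0), ("B", 1, 0), ("B", 0, 0), ("B", 1, 1),
    ]
    report = error_rates_by_group(toy)
    print(report)
    print("Groups to review:", flag_gaps(report))
```

A check like this says nothing about why a model fails for one group. It only shows where a closer look is warranted, which is the kind of monitoring role the article describes for transparency.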

The overall takeaway from Ghassemi’s work is that we need to treat all advice, including AI-based advice, with appropriate suspicion because advice that seems transparent can cause overconfidence.

Instead, AI explanations can and should be used to audit and monitor these systems to catch their flaws and biases, she says.