Making Artificial Intelligence Intelligible
Humans need to know how neural networks make decisions
The irony is not lost on Kate Saenko. Now that humans have programmed computers to learn, they want to know exactly what the computers have learned, and how they make decisions after their learning process is complete.
To do that, Saenko, a Boston University College of Arts & Sciences associate professor of computer science, used humans—asking them to look at dozens of pictures depicting steps that the computer may have taken on its road to a decision, and identify its most likely path.
Those experiments worked well. The humans gave Saenko answers that made sense, but there was a problem: they made sense to humans, and humans, Saenko knew, have biases. In fact, humans don’t even understand how they themselves make decisions. How in the world then could they figure out how a neural network, with millions of neurons and billions of connections, makes decisions?
So Saenko did a second experiment, using computers instead of people to help determine exactly what learning machines learned.
“What we learned that’s really important is that, despite the extreme complexity of these algorithms, it’s possible to peek under the hood and understand their decision-making process, and that we can actually ask humans to explain it to us,” says Saenko. “So we think it’s possible to teach humans how machines make predictions.”
Computer scientists know in general terms how neural networks develop. After all, they write the training programs that direct a computer’s so-called neurons, which are actually mathematical functions, to connect to other neurons. Each neuron parses one piece of information, and every neuron builds on the information from the neurons that precede it. Over time, connections evolve. They go from random to revealing, and the network “learns” to do things like identify enemy stations in satellite images or spot evidence of cancer long before it is visible to a human radiologist. Such networks identify faces. They drive cars.
That’s the good news. The disconcerting news, says Saenko, is that as artificial intelligence (AI) plays an increasingly important role in the lives of humans, its learning processes are becoming increasingly obscure. Just when we really need to trust them, they have become inscrutable. That’s a problem.
“The more we rely on artificial intelligence systems to make decisions, like autonomously driving cars, filtering newsfeed, or diagnosing disease, the more critical it is that the AI systems can be held accountable,” says Stan Sclaroff, a professor of computer science and interim dean of the College of Arts & Sciences. “One aspect of that is that all AI systems should be able to explain how they make decisions in a way that humans can understand. We should be able to see what evidence is used by an AI algorithm and examine the means by which the algorithm produced its answers or actions. It’s important for society that AI algorithms can be explainable, so that they can be held accountable for the decisions they make.”
“We have to come up with ways to evaluate explainable models of their decision-making process,” says Saenko. “We need to know that the explanation reflects the true underlying process, and that the network is not just giving us what we want to hear.”
Saenko’s first effort, which used humans to infer the network’s decision-making process by studying pictures depicting various stages, or modules, of the process, included coresearcher Trevor Darrell from the University of California, Berkeley. Saenko’s lab got $800,000 of a $7.55 million grant from the Defense Advanced Research Projects Agency (DARPA), which was awarded to both researchers.
Saenko and Darrell showed participants images of modules, steps in the computer’s decision-making process that involved the recognition of certain objects. After viewing varying series of modules, the participants, who knew what question the network had been asked but were not shown the network’s answer, were asked to predict the likelihood that the network would get the answer right. They were also asked how well they could understand the network’s internal reasoning process, and to rate how clear it was what the model was doing in each step (clear, mostly clear, somewhat unclear, or unclear). Saenko and Darrell reasoned that if humans predicted the model’s success or failure better than chance, then they understood at least something about the model’s decision process. Saenko’s research paper, Explainable Neural Computation via Stack Neural Module Networks, was presented September 11 at the European Conference on Computer Vision in Munich.
“What I’m really interested in,” she says, “is if humans can understand how machines work, especially with such complex algorithms. At this point there is no true explanation of why the network made the decision. For example, to decide if an image contains a dog, the network could be looking for ears, eyes, and tail. ‘Because I found two eyes and a tail’ is the explanation. If we had this gold standard explanation, then we could compare other explanations to it, to evaluate how correct they are. But the problem is how to get this truth. We could ask a human to guess it, but they’ll probably just guess how they would make that decision. So we need to come up with other ways of evaluating the explanations. We don’t know how a neural network really thinks, except to write down all of the millions of mathematical computations it is doing. But that’s not useful to a human user.”
Saenko says there’s one big reason why that won’t be easy. “It’s the same reason that we don’t understand how people think,” she says. “You could ask me why I wore this shirt today and I could come up with some rationalization, but who knows how my thinking really works? I don’t know what my brain process was really like.”
“We can rationalize how we think,” she says, “and we can teach machines to rationalize how they think. For example, we can ask it why it thinks something is a dog and it will say, ‘Oh, because it has ears and a tail and fur,’ but that may not actually be the reason that it predicted dog. There could be some other reason. Maybe it learned that all white objects are dogs.”
With simple machines, she says, machines that use basic decision trees, humans can easily explain what’s going on. But when there are millions of operations involved in a decision, researchers need a more abstract way of explaining things.
“That’s what we’re doing. We are finding an abstract way to explain it,” she says. “We are trying to learn if the process really reflects the underlying decision. If it does, then humans should be able to predict what’s going to happen next. We count how many times the human annotators were able to predict if the machine got the answer right or wrong and we compare that with previous methods explaining neural networks. We compare which learning models and previous neural networks lead to a higher accuracy in predicting what the model will say.”
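The counting Saenko describes can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' code: the function names and the sample data are hypothetical, and the baseline used here (always guessing the more common outcome) is an assumed stand-in for "chance."

```python
from typing import List


def prediction_accuracy(human_guesses: List[bool], model_was_correct: List[bool]) -> float:
    """Fraction of trials where the annotator correctly predicted
    whether the network would get the answer right."""
    if len(human_guesses) != len(model_was_correct) or not human_guesses:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(g == m for g, m in zip(human_guesses, model_was_correct))
    return hits / len(human_guesses)


def majority_baseline(model_was_correct: List[bool]) -> float:
    """Accuracy of always guessing the more common outcome (assumed baseline)."""
    rate = sum(model_was_correct) / len(model_was_correct)
    return max(rate, 1.0 - rate)


# Hypothetical data: True means "the network answered correctly".
actual = [True, False, False, True, False, True, True, False]
guesses = [True, True, False, True, False, False, True, False]

accuracy = prediction_accuracy(guesses, actual)
baseline = majority_baseline(actual)
print(f"annotator accuracy: {accuracy:.2f}, baseline: {baseline:.2f}")
```

If annotators who saw one kind of explanation score consistently above the baseline, and above annotators who saw another kind, that is evidence the first explanation conveys more about what the network is actually doing.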
Saenko’s work with humans is one way to validate the decision-making process, but, she says, because it does involve humans, it is subject to human biases. And those biases may fail to recognize the merit of a neural network’s processes. “Let’s say we have a neural network that learned ‘woman’ whenever it saw a kitchen. That would be a logical decision if most of the pictures of kitchens it was trained on had a woman in them. Now, if we had a very good explanation of that model it would understand that this is why the network said ‘woman.’ But if you ask a human to evaluate that, the human would say that’s a terrible explanation. The [focus] should be on the woman. But the network actually has a good explanation. It’s just that the model is not making a decision the same way a human would make it. So a biased human might say, ‘That’s not how I would make a decision, so it’s incorrect.’ The human would be wrong.”
To avoid such problems, Saenko, along with PhD student Vitali Petsiuk (GRS’24) and postdoctoral researcher Abir Das, designed a second set of experiments, also funded by DARPA, that relied solely on computers. That research paper, RISE: Randomized Input Sampling for Explanation of Black-box Models, was presented on September 4 at the 29th British Machine Vision Conference at Northumbria University.
“This time we didn’t have any humans in the loop,” says Saenko. “Instead, we had another computer program evaluate the first program’s explanations. The experiment works like this: The first program, the neural network, provides an explanation of why it made the decision by highlighting parts of the image that it used as evidence. The second program, the evaluator, uses this to obscure the important parts, and feeds the obscured image back to the first program. If the first program can no longer make the same decision, then the obscured parts were actually important, and the explanation is a good one. However, if it still makes the same decision, even with the obscured regions, then the explanation is judged to be insufficient.”
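The evaluation loop Saenko describes can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation from the paper: `toy_model` is a hypothetical stand-in for a neural network, the images are tiny grids of numbers, and "obscuring" simply means setting the highlighted pixels to zero.

```python
from typing import Callable, List

Image = List[List[float]]


def obscure_top_pixels(image: Image, saliency: Image, k: int, fill: float = 0.0) -> Image:
    """Return a copy of the image with the k most highlighted pixels replaced by fill."""
    coords = [(r, c) for r in range(len(image)) for c in range(len(image[0]))]
    coords.sort(key=lambda rc: saliency[rc[0]][rc[1]], reverse=True)
    masked = [row[:] for row in image]  # copy so the original stays untouched
    for r, c in coords[:k]:
        masked[r][c] = fill
    return masked


def explanation_holds(model: Callable[[Image], str], image: Image, saliency: Image, k: int) -> bool:
    """True if hiding the highlighted pixels changes the model's decision."""
    original_decision = model(image)
    return model(obscure_top_pixels(image, saliency, k)) != original_decision


def toy_model(image: Image) -> str:
    """Hypothetical classifier: says "dog" when the top-left 2x2 patch is bright."""
    patch = image[0][0] + image[0][1] + image[1][0] + image[1][1]
    return "dog" if patch > 2.0 else "not dog"


image = [[1.0, 1.0, 0.0],
         [1.0, 1.0, 0.0],
         [0.0, 0.0, 1.0]]
# One explanation highlights the patch the model really uses; the other does not.
good_saliency = [[0.9, 0.8, 0.0],
                 [0.7, 0.6, 0.0],
                 [0.0, 0.0, 0.1]]
bad_saliency = [[0.0, 0.0, 0.5],
                [0.0, 0.0, 0.6],
                [0.7, 0.8, 0.9]]

print(explanation_holds(toy_model, image, good_saliency, k=4))  # True
print(explanation_holds(toy_model, image, bad_saliency, k=4))   # False
```

Hiding the pixels named by the first explanation flips the decision, so that explanation passes; hiding the pixels named by the second leaves the decision unchanged, so it is judged insufficient.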
Which method does a better job of explaining a network’s decision-making process? Saenko is reluctant to pick a winner. “I would say that we don’t know which is better because we need both kinds of evaluations,” she says. “The computer doesn’t have human biases, so it’s a better evaluator in that sense. But we still do the evaluation with humans in the loop because in the end we know how humans interact with the machine.”
The more important question, says Saenko, is this: “Does this type of evaluation increase human trust in neural networks? Does it improve a human experience or improve the performance? Let’s say if you had a self-driving car and it could explain why it is driving a certain way, would it actually help you?
“I would say ‘yes,’” says Saenko. “But I would also say we need a lot more research.”
Commenting on this quote: “But if you ask a human to evaluate that, the human would say that’s a terrible explanation. The [focus] should be on the woman. But the network actually has a good explanation. It’s just that the model is not making a decision the same way a human would make it. So a biased human might say, ‘That’s not how I would make a decision, so it’s incorrect.’ The human would be wrong.”
That’s a misperception. There are so many misogynist humans who would come to the very same conclusion. They see a kitchen and one of their thoughts might be “women belong in the kitchen.” Why? Because they’ve been trained to make that association. We’re holding AI to some kind of weird gold standard, as if humans and human decision-making processes were perfect and foolproof. But they are not. Humans, just like AI, depend on their training data. It’s just that we humans call it “upbringing” and “social background.” If you train a neural network with images of kitchens that only have women in them, then – OF COURSE – you are creating a misogynist network. Because your training data was wrong.
If you take a human who doesn’t have any knowledge, show them a million images of kitchens with women in them, and then ask them to determine the meaning, that human would probably come to a similar conclusion.