Explainable artificial intelligence, or XAI, is a set of techniques, principles and processes that aim to help AI developers and users alike better understand AI models, both in terms of their algorithms and the outputs generated by them. Explainable AI can be used to describe an AI model, its expected impact and any potential biases, as well as assess its accuracy and fairness. As artificial intelligence becomes more advanced, many consider explainable AI to be essential to the industry’s future.

AI has come a long way in recent years. These days, large language models like OpenAI’s ChatGPT are capable of producing anything from working computer code to accurate medical diagnoses. And deep learning powers common technology like virtual assistants and TV streaming services, as well as cutting-edge innovations like self-driving cars and deepfakes.


Amid the growing sophistication and adoption of AI, there remains one major ongoing problem: People don’t understand why AI models make the decisions they do — even the researchers and developers who are creating them. AI algorithms often operate as black boxes, meaning they take inputs and provide outputs with no way to understand their inner workings. How these models arrive at their conclusions, the data they use and the trustworthiness of their results are not easy questions to answer.

“We train these large, complicated models to perform various tasks. And they turn out to be able to do them, at least in the sense of achiev[ing] high predictive accuracy fairly reliably. But we don’t really understand what they’re doing, or what it tells us about the underlying systems,” Zachary Lipton, an assistant professor of machine learning and operations research at Carnegie Mellon University, told Built In.


Why Does Explainable AI Matter?

Addressing these questions, and more, is the essence of explainable AI. And, lately, there’s been a surge of interest in its potential to make the complexity of AI models more manageable and understandable, with the aim of building trust and confidence among their users. Over the years, major tech giants like IBM, Google and Microsoft have rolled out their own explainable AI resources and toolkits, along with other companies.


AI Has Become More Ubiquitous In Everyday Life

The notion of explainable AI is not new. According to Upol Ehsan, an explainable AI researcher at Georgia Tech, attempts to make artificial intelligence more explainable began in the 1980s with the popularization of technology like AI-powered expert systems, which use databases of expert knowledge to make decisions and offer advice. Explainable AI’s popularity then resurged once deep learning systems became sophisticated enough to be productized.

Since then, artificial intelligence has seeped into virtually every facet of society, from healthcare to finance to even the criminal justice system. Generative adversarial networks, or GANs, have become a go-to resource in medical imaging, helping doctors detect brain tumors and other diseases. And police departments around the world have begun using complex facial recognition systems to help identify suspects.

“It’s getting applied for the purposes of increased performance and for increased efficiency in areas where you would have never thought there was AI involved,” Joshua Rubin, director of data science at AI model performance management startup Fiddler AI, told Built In. “It’s getting used more, and the kind of models that are being used are less intrinsically understandable.”


The Decisions AI Models Make Can Have Deep (Sometimes Harmful) Societal Ramifications

This can have deep ramifications for people on the receiving end of those models. Facial recognition software used by some police departments has been known to lead to false arrests of innocent people. People of color seeking loans to purchase homes or refinance have been overcharged by millions due to AI tools used by lenders. And many employers screen job applicants with AI-enabled tools, many of which have proven to be biased against people with disabilities and other protected groups.

And fixing or retiring a problematic algorithm doesn’t mean the harm it has caused goes away, too. Rather, harmful algorithms are “palimpsestic,” Ehsan told Built In: their traces remain, leaving what he calls algorithmic imprints. “There is no undo button. Even though algorithms are made of software, and we think of software as very malleable [and] deletable, the effects of these algorithmic systems are anything but that: they leave hard and persistent imprints on society and human lives.”


“How can you hold something accountable if you cannot ask the question, ‘Why?’ Therein lies this tension that we face in the industry today,” Ehsan said. “The models are becoming more powerful. They are becoming more opaque. Now, there is a higher need for explainability.”


AI Is Getting More Regulated, Requiring More Industry Accountability

As governments around the world continue working to regulate the use of artificial intelligence, explainability in AI will likely become even more important.

In the United States, for instance, President Joe Biden and his administration created an AI “bill of rights,” which includes guidelines for protecting personal data and limiting surveillance, among other things. And the Federal Trade Commission has been monitoring how companies collect data and use AI algorithms.

Meanwhile, after numerous amendments and much discussion, the Council of the European Union approved a compromise version of its proposed AI Act, and the Parliament is scheduled to put it to a vote later this year. Once it is in place, this will reportedly be the world’s first broad standard for regulating or banning certain uses of artificial intelligence. The EU is also working on a new law to hold companies accountable when their AI products cause harm, such as privacy infringements or biased algorithms.

Looking ahead, explainable AI could help mitigate the compliance, legal and security risks associated with a particular AI model.



How Does Explainable AI Work?

Typically, explainable AI seeks to explain one or more of the following things: the data used to train the model (including why it was chosen), the predictions made by the model (and what specifically was considered in reaching that prediction) and the role of the algorithms used in the model.

In the context of machine learning and artificial intelligence, explainability is the ability to understand “the ‘why’ behind the decision-making of the model,” Rubin said. Therefore, explainable AI requires “drilling into” the model in order to extract an answer as to why it made a certain recommendation or behaved in a certain way.

Researchers have developed several different approaches to explain AI systems, which are commonly organized into two broad categories: self-interpretable models and post-hoc explanations.

Self-interpretable models are, themselves, the explanations, and can be directly read and interpreted by a human. Put simply, the model is the explanation. Some of the most common self-interpretable models include decision trees and regression models, including logistic regression.
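As a minimal sketch of what “the model is the explanation” can look like, consider a toy logistic regression for a loan decision. The feature names, coefficients and applicant below are hypothetical values chosen for illustration, not taken from any real system. Because the model is just a weighted sum passed through a sigmoid, each feature’s contribution can be read straight off the coefficients.

```python
import math

# Hypothetical, hand-picked coefficients for a toy loan-approval model.
# In practice these would be learned from data; here they are assumptions.
INTERCEPT = -4.0
COEFFICIENTS = {
    "income_thousands": 0.05,
    "years_employed": 0.30,
    "missed_payments": -0.80,
}


def predict_proba(applicant):
    """Return the approval probability from the logistic model."""
    log_odds = INTERCEPT + sum(
        COEFFICIENTS[name] * applicant[name] for name in COEFFICIENTS
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


def explain(applicant):
    """Each feature's contribution to the log-odds is coefficient * value."""
    return {name: COEFFICIENTS[name] * applicant[name] for name in COEFFICIENTS}


applicant = {"income_thousands": 60, "years_employed": 5, "missed_payments": 2}
contributions = explain(applicant)
probability = predict_proba(applicant)

# Print features from most to least influential for this applicant.
for name, value in sorted(
    contributions.items(), key=lambda kv: abs(kv[1]), reverse=True
):
    print(f"{name:>18}: {value:+.2f} log-odds")
print(f"approval probability: {probability:.3f}")
```

No separate explanation tool is needed here: the signs and sizes of the coefficients are the explanation, which is what distinguishes this kind of model from a black box.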

Meanwhile, post-hoc explanations describe or model the algorithm to give an idea of how said algorithm works. These are often generated by other software tools, and can be used on algorithms without any inner knowledge of how that algorithm actually works, so long as it can be queried for outputs on specific inputs.
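To make the query-only idea concrete, here is a rough sketch that treats a function as a black box, samples points around one input and fits a proximity-weighted linear model to the answers. The `black_box` function, the sampling scale and the helper names are all assumptions made for this illustration; real tools are considerably more careful about sampling and feature representation.

```python
import math
import random


def black_box(x):
    """Stand-in for an opaque model: we may call it but not inspect it."""
    return 1.0 / (1.0 + math.exp(-(3.0 * x[0] - 2.0 * x[1] ** 2)))


def solve(a, b):
    """Solve the linear system a @ w = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - tail) / m[i][i]
    return w


def local_surrogate(model, point, n_samples=500, scale=0.1, seed=0):
    """Fit a proximity-weighted linear model to `model` around `point`."""
    rng = random.Random(seed)
    d = len(point)
    xtwx = [[0.0] * (d + 1) for _ in range(d + 1)]
    xtwy = [0.0] * (d + 1)
    for _ in range(n_samples):
        offset = [rng.gauss(0.0, scale) for _ in range(d)]
        sample = [p + o for p, o in zip(point, offset)]
        # Closer samples get more weight (Gaussian kernel on distance).
        weight = math.exp(-sum(o * o for o in offset) / (2 * scale ** 2))
        row = [1.0] + offset  # intercept term plus offsets from the point
        y = model(sample)  # the only access to the model is this query
        for i in range(d + 1):
            xtwy[i] += weight * row[i] * y
            for j in range(d + 1):
                xtwx[i][j] += weight * row[i] * row[j]
    intercept, *slopes = solve(xtwx, xtwy)
    return intercept, slopes


point = [0.5, 1.0]
intercept, slopes = local_surrogate(black_box, point)
print("local slopes:", [round(s, 2) for s in slopes])
```

The fitted slopes say how the output moves when each input is nudged near this one point. They describe the model’s local behavior only, and a different point could yield a very different surrogate.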

One commonly used post-hoc explanation algorithm is LIME, or local interpretable model-agnostic explanations. It takes a single decision, queries the model on nearby points and fits a simple interpretable model that approximates the decision locally, then uses that simpler model to provide the explanation. Another is SHAP, or Shapley additive explanations, which explains a given prediction by computing how much each feature contributed to it.
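The Shapley idea can be computed exactly when a model has only a handful of features. The sketch below does that by enumerating every coalition of features, with absent features replaced by a baseline value. The toy `model`, the feature names and the all-zero baseline are assumptions for illustration; this is not the SHAP library’s API, which relies on approximations to handle realistic feature counts.

```python
from itertools import combinations
from math import factorial


def model(features):
    """Toy predictor with an interaction term; stands in for any model."""
    return (
        2.0 * features["age"]
        + 1.0 * features["income"]
        - 3.0 * features["debt"]
        + 0.5 * features["age"] * features["income"]
    )


def shapley_values(predict, instance, baseline):
    """Exact Shapley values; features outside a coalition take baseline values."""
    names = list(instance)
    n = len(names)

    def value(coalition):
        mixed = {
            k: (instance[k] if k in coalition else baseline[k]) for k in names
        }
        return predict(mixed)

    result = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_feature = value(set(subset) | {name})
                without_feature = value(set(subset))
                total += weight * (with_feature - without_feature)
        result[name] = total
    return result


instance = {"age": 2.0, "income": 4.0, "debt": 1.0}
baseline = {"age": 0.0, "income": 0.0, "debt": 0.0}
phi = shapley_values(model, instance, baseline)
print(phi)
```

The contributions add up to the difference between the prediction for the instance and the prediction for the baseline, and the interaction between `age` and `income` is split evenly between the two features.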

Explanations can also be formatted in different ways. Graphical formats are perhaps most common, which include outputs from data analyses and saliency maps. They can also be formatted verbally through speech, or written as reports.

Some Algorithms Used in Explainable AI

  • Shapley additive explanations (SHAP): Explains a given prediction by computing how much each feature contributed to the prediction.
  • Local interpretable model-agnostic explanations (LIME): Takes a decision and, by querying nearby points, builds an interpretable model that approximates the decision locally. It then uses that model to provide explanations.
  • Morris sensitivity analysis: Known as a one-at-a-time analysis, meaning only one input has its level adjusted per run. This is commonly used to determine which inputs are important enough to warrant further analysis.
  • Contrastive explanation method (CEM): Explains classification models by identifying which features must be present, and which must be absent, to justify a given classification.
  • Scalable Bayesian rule lists: Creates a list of “if-then” rules, where the antecedents are mined from the data set and the set of rules and their order are learned.
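The one-input-per-run idea behind Morris sensitivity analysis can be sketched in a few lines. This is a simplified screening under stated assumptions: the toy `model`, the unit input range and the step size are invented for illustration, and the full Morris method uses a structured trajectory design that this sketch leaves out.

```python
import random


def model(x):
    """Toy function: x[0] matters a lot, x[1] a little, x[2] not at all."""
    return 10.0 * x[0] + 0.5 * x[1] ** 2 + 0.0 * x[2]


def mean_abs_elementary_effects(f, n_inputs, n_repeats=50, delta=0.1, seed=0):
    """Average absolute change in output per unit step, one input at a time."""
    rng = random.Random(seed)
    totals = [0.0] * n_inputs
    for _ in range(n_repeats):
        # Random base point, leaving room for the step inside [0, 1].
        base = [rng.uniform(0.0, 1.0 - delta) for _ in range(n_inputs)]
        f_base = f(base)
        for i in range(n_inputs):
            stepped = base[:]
            stepped[i] += delta  # adjust exactly one input for this run
            totals[i] += abs((f(stepped) - f_base) / delta)
    return [t / n_repeats for t in totals]


mu_star = mean_abs_elementary_effects(model, n_inputs=3)
ranking = sorted(range(3), key=lambda i: mu_star[i], reverse=True)
print("mean absolute effects:", [round(m, 3) for m in mu_star])
print("inputs ranked by importance:", ranking)
```

Inputs with a mean effect near zero are candidates to drop from further analysis, which is the screening role the list above describes.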



Principles of Explainable AI

The National Institute of Standards and Technology, or NIST, a government agency within the United States Department of Commerce, has developed four key principles of explainable AI.

The 4 Principles of Explainable AI

  1. Explanation: A system should supply “evidence, support or reasoning” related to a given outcome or process of an AI system. An explanation will likely vary depending on the given system and scenario.
  2. Meaningful: A system must be able to provide an explanation that the intended user(s) can understand, which requires that developers consider their intended audience.
  3. Explanation Accuracy: An explanation must correctly reflect the reason for generating a particular output, as well as accurately reflect the system’s process. 
  4. Knowledge Limits: A system must operate only under the conditions for which it was designed and when it reaches sufficient confidence in its output. Identifying and declaring a system’s limits can safeguard against “misleading, dangerous or unjust outputs.”

Source: “The Four Principles of Explainable Artificial Intelligence” by the National Institute of Standards and Technology



Explanation

The first principle is that the system should be able to explain its output and provide supporting evidence. According to the NIST report, a system is explainable when it “supplies evidence, support or reasoning related to an outcome or a process of an AI system.” This principle does not offer any specific metric or quality of those explanations. Rather, it simply says that explanations “will vary, and should, according to the given system and scenario.”

There are many types of explanations — some benefit the end user, some are designed to foster trust in the system, and some are meant to meet certain regulatory requirements. Meanwhile, others can help with algorithm development and maintenance. The explanation depends on who the user is, and the context in which the model exists.

As Ehsan puts it, “explainable AI is pluralistic.” There is no one-size-fits-all solution. “There are different types of fields. Different goals, different use cases.”



Meaningful

Whatever the given explanation is, it has to be meaningful, according to NIST, which brings us to the second principle of explainable AI. If there is a range of users with diverse knowledge and skill sets, the system should provide a range of explanations to meet the needs of those users.

Incidentally, this is the basis of human-centered explainable AI, or HCXAI, a subfield of explainable AI pioneered by Ehsan and his colleague Mark Riedl in a 2020 research paper, where the human is put at the center of technology design for explainability. It develops a more holistic understanding of “who” the explanation is for. 

“Explainability is a human factor, it is not an algorithmic factor,” Ehsan said. “Not everything we care about lies inside the black box. Critical insights lie outside it. Why? Because that’s where the humans are.”


As an example, he described his first ride as a passenger in a self-driving car, which he took as a researcher in a research and development setting. “The engineer of the self-driving car needed very different explanations from the car than what I, as the passenger, did,” he continued. He was given a tablet showing LiDAR (light detection and ranging) images of what the car was sensing in real time, and how it was avoiding obstacles. At the same time, the operator was “translating” the machine inputs he was getting from a technical language into one Ehsan could understand. The operator needed to know how the car was operating, while Ehsan just needed to know that the car was navigating safely.


Explanation Accuracy

The explanation also needs to be clear and accurate, which is distinctly different from how accurate the actual AI system is.

“Regardless of the system’s decision accuracy, the corresponding explanation may or may not accurately describe how the system came to its conclusion or action,” the report continued. “Additionally, explanation accuracy needs to account for the level of detail in the explanation.”


Knowledge Limits

Finally, the AI system must operate within its designed “knowledge limits” to ensure a reasonable outcome, meaning it should only operate under the specific conditions for which it was designed, and once it reaches sufficient confidence in its output. “Identifying and declaring” the limits of a given AI system’s knowledge can “increase trust” in the system by “preventing misleading, dangerous or unjust outputs.”
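One way to picture the knowledge limits principle is a wrapper that declines to answer when an input falls outside the conditions the model was built for, or when the model is not confident enough. The input ranges, the confidence threshold and the toy risk model below are hypothetical values chosen for illustration; NIST’s report does not prescribe any particular mechanism.

```python
import math
from dataclasses import dataclass
from typing import Optional

# Hypothetical design envelope: input ranges covered during development.
DESIGN_RANGES = {"speed_kmh": (0.0, 130.0), "temperature_c": (-10.0, 40.0)}
MIN_CONFIDENCE = 0.8  # assumed threshold, to be tuned per application


@dataclass
class Decision:
    label: Optional[str]  # None means the system declined to answer
    confidence: float
    reason: str


def toy_risk_model(inputs):
    """Stand-in classifier: returns the probability of the 'high_risk' label."""
    return 1.0 / (1.0 + math.exp(-(inputs["speed_kmh"] - 60.0) / 10.0))


def decide(inputs):
    """Answer only inside the design range and above the confidence threshold."""
    for name, (low, high) in DESIGN_RANGES.items():
        if not low <= inputs[name] <= high:
            reason = f"{name}={inputs[name]} is outside the design range"
            return Decision(None, 0.0, reason)
    p_high = toy_risk_model(inputs)
    label = "high_risk" if p_high >= 0.5 else "low_risk"
    confidence = max(p_high, 1.0 - p_high)
    if confidence < MIN_CONFIDENCE:
        return Decision(None, confidence, "confidence below threshold")
    return Decision(label, confidence, "within design range and confident")


print(decide({"speed_kmh": 100.0, "temperature_c": 20.0}))
print(decide({"speed_kmh": 62.0, "temperature_c": 20.0}))
print(decide({"speed_kmh": 100.0, "temperature_c": 55.0}))
```

Returning a reason alongside each refusal is a deliberate choice: declaring the limit, rather than silently guessing, is what the principle asks for.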



Explainable AI Use Cases


Finance

Finance is a heavily regulated industry, so explainable AI is necessary for holding AI models accountable. Artificial intelligence is used to help assign credit scores, assess insurance claims, improve investment portfolios and much more. If the algorithms used to make these tools are biased, and that bias seeps into the output, that can have serious implications for a user and, by extension, the company.

“There is risk associated with using a model to make, for example, a credit underwriting decision. You don’t just want to roll something out willy-nilly unless it’s met certain kinds of standards,” Rubin said.

It’s also important that other kinds of stakeholders better understand a model’s decision. In finance, this includes people like lending agents or fraud auditors — people who don’t necessarily need to know all of the technical details of the model, but who are able to do their job better when they understand not only what a given model recommends, but also why the model recommends it.


Autonomous Vehicles

Explainability is a high priority for autonomous cars, on both the research and corporate sides.

Autonomous vehicles operate on vast amounts of data in order to figure out both their position in the world and the position of nearby objects, as well as their relationship to each other. And the system needs to be able to make split-second decisions based on that data in order to drive safely. Those decisions should be understandable to the people in the car, the authorities and insurance companies in case of any accidents.

Why did the car swerve left instead of right? What caused the brakes to be applied? How the heck does this thing even work? All of these, and more, are questions that explainable AI attempts to answer with self-driving cars.



Healthcare

The healthcare industry is one of artificial intelligence’s most ardent adopters, using it as a tool in diagnostics, preventative care, administrative tasks and more. And in a field as high-stakes as healthcare, it’s important that both doctors and patients have peace of mind that the algorithms used are working properly and making the correct decisions.

For example, hospitals can use explainable AI for cancer detection and treatment, where algorithms show the reasoning behind a given model’s decision-making. This makes it easier for doctors not only to make treatment decisions, but also to provide data-backed explanations to their patients.



Explainable AI’s Challenging Future

For all of its promise in terms of promoting trust, transparency and accountability in the artificial intelligence space, explainable AI certainly has some challenges, not least of which is that there is no one way to think about explainability, or to define whether an explanation is doing exactly what it’s supposed to do.

“There is no fully generic notion of explanation,” Lipton, from Carnegie Mellon, said. This runs the risk of the explainable AI field becoming too broad, where it doesn’t actually effectively explain much at all.


Lipton likens it to “wastebasket diagnoses” in the medical industry, where a diagnosis is too vague or broad to have any real treatment or solution. “The problem is that you can’t come up with a cure for that category because it doesn’t describe a single disease. It’s too broad. The only way you can make progress is to divide and conquer.”

But perhaps the biggest hurdle for explainable AI is AI itself, and the breakneck pace at which it is evolving. We’ve gone from machine learning models that look at structured, tabular data to models that consume huge swaths of unstructured data, which makes understanding how the model works much more difficult, never mind explaining it in a way that makes sense. Interrogating the decisions of a model that makes predictions based on clear-cut things like numbers is a lot easier than interrogating the decisions of a model that relies on unstructured data like natural language or raw images.

Nevertheless, it is unlikely that the field of explainable AI is going anywhere anytime soon, particularly as artificial intelligence continues to become more entrenched in our everyday lives, and more heavily regulated.

“As long as there are high stakes involved, and there’s a need for accountability, and the truth remains that AI systems are fallible (they’re not perfect), explainability will always be a need,” Ehsan said. Whether explainability will come part and parcel with AI “remains to be seen,” he added, simply because AI has changed, and continues to change, so much. “What we meant by AI even 10 years ago is very different from what we mean by AI now. So what it means to be explainable could mean very different things even in 10 years.”
