Model Interpretability in a Nutshell

Eero Laaksonen

Software is eating the world, and machine learning is eating software. Machine learning models solve countless use cases where traditional software struggles. These use cases are often too complex to define and code rigid rules for, but machine learning models can be trained to find the most complex patterns in data.

Complexity, however, brings the challenge that it may be almost impossible for a person to determine how the ML model arrived at its predictions. This is where model interpretability comes in. As we rely on this technology more and more, we must have transparency to understand the results.

Defining Interpretability

Let's start by defining interpretability in the context of machine learning and AI. In simple terms, it means how easily a human can interpret how the model arrived at a decision.

What were the factors that led to the prediction? What was the process the AI tool used to reach a decision? The more clearly you can answer these questions, the more interpretable your machine learning model is.

Another term that goes together with interpretability is explainable AI. The goal is to explain why and how a model did something. If you can understand why a model works in simple terms, then that model is interpretable.

What Makes ML Difficult to Interpret?

You may be wondering what makes ML challenging to interpret – especially since humans developed this technology. The logic behind machine learning isn't understandable by default, and as we use the technology for increasingly complex applications and large datasets, our ability to understand and explain results decreases even further.

When you build an ML model, you are essentially creating an algorithm through countless tiny iterations, which continue until the algorithm captures the patterns you want (or at least the patterns you hope it is capturing). This can lead to a black-box model, where we provide inputs and let the AI perform complex calculations to reach a decision. In other words, we may not know which features and inputs the model finds important, how it made its decision, or even what it looks at.

Similarly, the model may be trained with data that contains implicit biases. Societal biases, prejudice, and stereotypes can be built into datasets without us even being aware of them. For example, a particular group of people may be underrepresented, leading to unfair predictions for that group.
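One simple way to look for the kind of underrepresentation described above is to count how often each group appears in the training data and compare outcome rates per group. The following is a minimal sketch in plain Python, not a full fairness audit; the records, the `group` and `approved` field names, and the 25% threshold are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical training records; field names and values are illustrative only.
records = [
    {"group": "A", "approved": 1},
    {"group": "A", "approved": 1},
    {"group": "A", "approved": 0},
    {"group": "A", "approved": 1},
    {"group": "A", "approved": 1},
    {"group": "A", "approved": 0},
    {"group": "A", "approved": 1},
    {"group": "A", "approved": 1},
    {"group": "B", "approved": 0},
    {"group": "B", "approved": 1},
]


def group_summary(rows, group_key="group", label_key="approved"):
    """Return each group's share of the data and its positive-label rate."""
    counts = Counter(row[group_key] for row in rows)
    positives = defaultdict(int)
    for row in rows:
        positives[row[group_key]] += row[label_key]
    total = len(rows)
    return {
        group: {
            "share": counts[group] / total,
            "positive_rate": positives[group] / counts[group],
        }
        for group in counts
    }


def underrepresented(summary, min_share=0.25):
    """List groups whose share of the data is below an assumed threshold."""
    return sorted(g for g, stats in summary.items() if stats["share"] < min_share)


summary = group_summary(records)
flagged = underrepresented(summary)

for group, stats in sorted(summary.items()):
    print(group, stats)
print("Underrepresented groups:", flagged)
```

A low share or a very different positive rate does not prove the model is biased, but it points to where a closer look at the data is warranted.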

If you combine these factors, you end up with AI that may be accurate -- but we have no idea how or why it works.

How to Make Models More Transparent

The good news is that there are steps you can take to make models more transparent. Increasing the interpretability of your AI makes it easier to find and fix problems, which in turn can improve your results. It can also encourage more people to adopt the technology.

Explainable AI (XAI) refers to a set of tools and frameworks that let you interpret how your machine learning models work and better understand the results. These tools allow you to dive into the model's behavior, so you can debug it and improve performance, as well as explain predictions and outputs.

Explainable AI tools may shed light on how much each variable contributed to a prediction. They expose the features the algorithm weighted most heavily in arriving at a decision, as well as the data it effectively ignored.
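One common, model-agnostic way to estimate which features a model relies on is permutation importance: shuffle one feature column at a time and measure how much the model's accuracy drops. The sketch below uses only the Python standard library. The `model_predict` scoring rule, the synthetic applicants, and the feature names are assumptions made for illustration, standing in for a real trained model and dataset.

```python
import random


def model_predict(row):
    """Stand-in for a trained black-box model; this scoring rule is made up."""
    credit_score, debt_to_income, _zip_digit = row  # the last feature is ignored
    return 1 if credit_score / 850 - debt_to_income > 0.35 else 0


def accuracy(rows, labels):
    """Fraction of rows where the model's prediction matches the label."""
    hits = sum(model_predict(r) == y for r, y in zip(rows, labels))
    return hits / len(rows)


def permutation_importance(rows, labels, repeats=20, seed=0):
    """Average accuracy drop when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild each row with only feature j replaced by a shuffled value.
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(shuffled, labels))
        importances.append(sum(drops) / repeats)
    return importances


# Synthetic applicants: (credit_score, debt_to_income, zip_digit).
data_rng = random.Random(42)
rows = [
    (data_rng.uniform(500, 850), data_rng.uniform(0.1, 0.6), data_rng.randint(0, 9))
    for _ in range(200)
]
# Assumption: labels are the model's own outputs, so baseline accuracy is 1.0
# and every drop reflects how much the model depends on the shuffled feature.
labels = [model_predict(r) for r in rows]

FEATURES = ["credit_score", "debt_to_income", "zip_digit"]
importances = dict(zip(FEATURES, permutation_importance(rows, labels)))
for name, value in importances.items():
    print(f"{name}: {value:.3f}")
```

Because the stand-in model never reads `zip_digit`, shuffling that column changes nothing and its importance comes out as zero, which is the kind of "excluded data" signal an XAI report can surface.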

A few examples of XAI tools:

  • The Google-backed What-If Tool lets you visualize how different data points affect the predictions of trained TensorFlow models.

  • The Microsoft-backed InterpretML is a toolkit that similarly helps visualize and explain predictions.

For instance, if you have an AI tool that helps you determine the creditworthiness of a mortgage applicant, an XAI report may tell you how much weight was given to their credit score versus their debt-to-income ratio.
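To make the mortgage example concrete, here is a minimal sketch of such a report for a simple logistic-regression style model. For a linear model, each feature's contribution to the log-odds can be read off as its weight times the difference between the applicant's value and a reference applicant's value. The weights, intercept, baseline applicant, and applicant values below are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical coefficients of a logistic-regression credit model.
WEIGHTS = {"credit_score": 0.01, "debt_to_income": -6.0}
INTERCEPT = -4.5
# Assumed "average" applicant used as the reference point for explanations.
BASELINE = {"credit_score": 700, "debt_to_income": 0.30}


def log_odds(applicant):
    """Linear score of the model before the sigmoid is applied."""
    return INTERCEPT + sum(WEIGHTS[k] * applicant[k] for k in WEIGHTS)


def approval_probability(applicant):
    """Sigmoid of the log-odds, i.e. the predicted approval probability."""
    return 1 / (1 + math.exp(-log_odds(applicant)))


def contributions(applicant):
    """Per-feature shift in log-odds relative to the baseline applicant."""
    return {k: WEIGHTS[k] * (applicant[k] - BASELINE[k]) for k in WEIGHTS}


applicant = {"credit_score": 640, "debt_to_income": 0.45}
report = contributions(applicant)

print(f"Approval probability: {approval_probability(applicant):.2f}")
for feature, shift in sorted(report.items(), key=lambda kv: kv[1]):
    print(f"{feature}: {shift:+.2f} log-odds vs. baseline")
```

With these made-up numbers, the debt-to-income ratio pulls the score down more than the credit score does, and the two contributions add up exactly to the gap between this applicant's log-odds and the baseline's. Non-linear models need more involved attribution methods, but the shape of the report is similar.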

If your machine learning model is interpretable, you will be able to explain why an individual's credit application was denied. This is especially important when operating in highly regulated industries. Likewise, if the borrower seems to meet all the criteria for approval but gets denied, you must be able to isolate the part of the model or data that caused this to happen.

Why Does It Matter Whether Models Are Interpretable?

As you may have guessed, uninterpretable, black-box models pose a big issue in regulated fields like finance and healthcare. In these industries, it's not enough to know what was predicted -- there needs to be transparency into what went into the decision and why it was made. The value of interpretability increases with the impact the predictions have on end users' lives. It may also increase depending on what data the model uses to make those predictions; personal information will often come with a more substantial need for interpretability because bias may be introduced unknowingly.

While interpretability may be less crucial for an AI predicting customer churn, it is a must-have for a machine learning model making medical decisions. Doctors need to rely on the predictions made by the algorithm, but they also need enough detail to explain to patients and their families why a treatment program is recommended.

Additionally, if you are unable to explain the predictions made by the model, this may cause your stakeholders or end-users to distrust the system. Can you blame them, though? If your life is on the line, you need to be confident in the training data, how the model interprets new inputs, and how it comes up with predictions.

It is necessary to improve the interpretability of your models if you wish to apply them to high-risk areas, such as medical and financial applications. The future of AI's ability to shape these industries depends on the widespread adoption of explainable models that have built-in trust and transparency.
