ML Model Transparency: LIME for Explainability in Python

Enhance model transparency and gain deeper insights into predictions with comprehensive explanations, enabling informed decisions and building trust in AI systems

Amit Kulkarni
AI Advances



In this blog, we will cover the following topics:

  • Introduction
  • The dataset and data load
  • The building blocks of modeling
    - Decision tree model
    - Logistic regression model
  • Need for model explainability
    - Random forest model
  • What is LIME?
  • Application of LIME
  • Explaining Model Predictions with LIME
  • Pros and Cons of LIME
  • Conclusion & FAQs

Introduction

Machine learning is a crucial tool in today’s business landscape, enabling companies to tackle complex challenges. With the exponential growth of data volume and intricate use cases, data scientists are increasingly using sophisticated models for analysis and predictions. However, the balance between model accuracy and ease of interpretation remains a challenge. Advanced techniques like Bagging, Boosting, and Random Forests are often used for high accuracy but lack transparency, making it difficult for stakeholders to understand their inner workings. Simpler models like Linear Regression or Decision Trees offer greater interpretability but may sacrifice accuracy. This dilemma highlights the need for trust in machine learning models for widespread adoption across industries. This blog explores balancing accuracy and understandability in machine learning models, aiming to make them more accessible for everyone.

Data loading

We will be using the data from Playground Series — Season 4, Episode 1 Binary Classification with a Bank Churn Dataset from Kaggle. The detailed definition of each of the variables can be found in the Kaggle dataset description.

I have used the same dataset in my previous blogs: the first, where we learned three approaches to building ensemble models, and the second, where we used automation tools like Autogluon and Autoviz for effortless modeling. This blog is not a sequel to those posts, but I thought it would be better to keep the same dataset, experiment on it, and learn from it.
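
The code in the following sections assumes that a feature matrix X, a target y, and a combined dataframe df_train_combined have already been prepared from the Kaggle file. Here is a minimal sketch of that preparation; the file name train.csv, the dropped columns, and the use of one-hot encoding with drop_first=True are assumptions based on the competition data and the dummy feature names (e.g. Geography_Germany, Gender_Male) that appear later in this blog.

import pandas as pd

# Load the Kaggle bank churn training data (file name assumed to be train.csv)
df_train = pd.read_csv("train.csv")
target = "Exited"

# One-hot encode the categorical columns; drop_first=True is assumed, based on
# the dummy feature names (Geography_Germany, Gender_Male) used later on
df_train_combined = pd.get_dummies(df_train.drop(columns=["id", "Surname"]),
                                   columns=["Geography", "Gender"],
                                   drop_first=True)

# Features and target used by all the models below
X = df_train_combined.drop(columns=[target])
y = df_train_combined[target]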

Model Building

The ML model we are developing uses data to predict customer behavior. To explain its effectiveness, we are using simple models like decision trees or logistic regression. These models provide clear insights into decision-making processes and allow us to understand why certain predictions were made and the factors that influenced them. By using these models, we aim to understand why customers might leave and what actions can be taken to retain them, ultimately enhancing the bank’s ability to make informed decisions and better serve its customers. In the next section, we will build 2 models — Decision tree and Logistic regression models to understand their explainability.

Decision tree model

This model is easy to build and explain. We can also visualize how the model has classified the data and arrived at its predictions, as shown in Fig 1 below.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.metrics import accuracy_score

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Build a Decision Tree classifier
clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)

# Visualize the Decision Tree
plt.figure(figsize=(15, 10))
plot_tree(clf, filled=True, feature_names=df_train_combined.columns.drop(target),
          class_names=[str(c) for c in clf.classes_])  # class labels 0/1
plt.show()

# Evaluate the model
y_pred_train = clf.predict(X_train)
train_accuracy = accuracy_score(y_train, y_pred_train)
print("Training Accuracy:", train_accuracy)

y_pred_test = clf.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred_test)
print("Test Accuracy:", test_accuracy)
Source: Author | Fig 1: Visualizing the decision tree

The visual representation of the model shows how the data is split at each node, enabling data scientists to identify the factors influencing the outcome. This approach, which resembles rule-based classification, is more readily embraced by customers and businesses because the predictions come with a straightforward explanation, making the model easier to trust and adopt for decision-making.
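
If the plotted tree grows too large to read comfortably, the same rule-based view can be printed as plain text. Here is a small sketch using scikit-learn's export_text, reusing clf and X from the code above and truncating the printout to the first few levels:

from sklearn.tree import export_text

# Print the learned splits as plain-text rules; max_depth limits the printout
# so a deep, unconstrained tree stays readable
rules = export_text(clf, feature_names=list(X.columns), max_depth=3)
print(rules)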

Logistic regression model

Logistic regression predicts the probability of an outcome based on input features and then decides whether the outcome is likely to happen (yes) or not (no). It works by fitting a linear decision boundary to the features and passing the result through a sigmoid function, which converts it into a probability between 0 and 1.

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Build a Logistic Regression model with increased max_iter
clf_LogisticRegression = LogisticRegression(max_iter=1000, random_state=42)
clf_LogisticRegression.fit(X_train, y_train)

# Evaluate the model
train_accuracy = clf_LogisticRegression.score(X_train, y_train)
print("Training Accuracy:", train_accuracy)

test_accuracy = clf_LogisticRegression.score(X_test, y_test)
print("Test Accuracy:", test_accuracy)

-------------------------------------------------------------------
OUTPUT:
Training Accuracy: 0.85
Test Accuracy: 0.83

In logistic regression, we don’t visually track the model’s decision-making as we do with decision trees. Instead, we get a clear equation that guides predictions. For any given record, we receive predicted probabilities for each class. For example, in the code below, for the record at index 1, we get the predicted probabilities for each class ([0.95404717, 0.04595283]) along with the predicted class (0 in this case).

This equation helps us understand how each feature influences the prediction. This makes logistic regression useful when we need to explain and interpret the model’s decisions, especially in scenarios where transparency and insight into individual features’ impact are essential.


record_index = 1
predicted_probabilities = clf_LogisticRegression.predict_proba(X_test.iloc[[record_index]])
predicted_class = clf_LogisticRegression.predict(X_test.iloc[[record_index]])
print("Predicted Probabilities for Record {}: {}".format(record_index, predicted_probabilities))
print("Predicted Class for Record {}: {}".format(record_index, predicted_class))

intercept = clf_LogisticRegression.intercept_[0]
coefficients = clf_LogisticRegression.coef_[0]

# Write the equation
terms = ' + '.join([f'({coef:.4f} * {feat})'
                    for coef, feat in zip(coefficients, X.columns)])
sign = ' + ' if intercept > 0 else ' '
equation = f"P(y=1 | X) = 1 / (1 + e^(-({intercept:.4f}{sign}{terms})))"

print("Logistic Regression Equation:")
print(equation)

--------------------------------------------------------------------
OUTPUT:
Predicted Probabilities for Record 1: [[0.95404717 0.04595283]]
Predicted Class for Record 1: [0]

Logistic Regression Equation:
P(y=1 | X) = 1 / (1 + e^(-(-2.3000 (1.3355 * Geography_Germany) +
(0.3696 * Geography_Spain) + (-0.7362 * Gender_Male) +
(-0.2597 * CustomerId) + (0.3967 * CreditScore) + (4.3726 * Age) +
(-0.3758 * Tenure) + (0.2839 * Balance) + (-0.6963 * NumOfProducts) +
(-0.0033 * HasCrCard) + (-1.2023 * IsActiveMember) +
(0.4897 * EstimatedSalary))))
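
As a sanity check, we can plug one record's feature values into this equation ourselves and confirm that we get back the same probability as predict_proba. A small sketch, reusing the intercept, coefficients, and record_index from above:

import numpy as np

# Reproduce P(y=1) for the chosen record by hand: z = intercept + w.x,
# then apply the sigmoid. This should match predict_proba for that record.
x = X_test.iloc[record_index].values
z = intercept + np.dot(coefficients, x)
p_class1 = 1 / (1 + np.exp(-z))
print("Manual P(y=1) for record", record_index, ":", p_class1)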

Let's look at the first 50 predictions and their predicted probabilities and classes.

for i in range(50):
    predicted_probabilities = clf_LogisticRegression.predict_proba(X_test.iloc[[i]])
    predicted_class = clf_LogisticRegression.predict(X_test.iloc[[i]])
    print(f'Record: {i} | Probability: {predicted_probabilities} | Predicted Class: {predicted_class}')

--------------------------------------------------------------------
OUTPUT:

Record: 0 | Probability: [[0.82013921 0.17986079]] | Predicted Class: [0]
Record: 1 | Probability: [[0.95404717 0.04595283]] | Predicted Class: [0]
...
Record: 22 | Probability: [[0.48815073 0.51184927]] | Predicted Class: [1]
Record: 23 | Probability: [[0.31007358 0.68992642]] | Predicted Class: [1]
Record: 24 | Probability: [[0.74217834 0.25782166]] | Predicted Class: [0]
...
Record: 48 | Probability: [[0.84526572 0.15473428]] | Predicted Class: [0]
Record: 49 | Probability: [[0.3497979 0.6502021]] | Predicted Class: [1]

So what is the challenge?

As data complexity grows, basic models like logistic regression and decision trees may struggle. To address this, data scientists turn to more complex models like Random Forest, LightGBM, or XGBoost. However, these models are harder to explain: they behave like black boxes, and businesses may hesitate to adopt them because they can’t easily understand why a specific prediction was made. While these models offer superior performance, their opacity can be a drawback. It’s a trade-off between accuracy and interpretability, and as data science evolves, striking a balance between model complexity and explainability remains a key challenge. In the next section, let’s build an ensemble model (Random Forest) to understand this better.

Random forest model

Random Forest is a robust ensemble learning technique used for classification and regression tasks. It constructs multiple decision trees during training and outputs the mode of the individual trees' predictions (for classification) or their mean (for regression). Unlike logistic regression, which is a simple, linear, and easy-to-interpret model, a random forest is a non-linear model with greater flexibility and performance at the cost of interpretability.

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
clf_RFC = RandomForestClassifier(random_state=42)
clf_RFC.fit(X_train, y_train)

# Evaluate the model
train_accuracy = clf_RFC.score(X_train, y_train)
print("Training Accuracy:", train_accuracy)
test_accuracy = clf_RFC.score(X_test, y_test)
print("Test Accuracy:", test_accuracy)

record_index = 38
predicted_probabilities_rf = clf_RFC.predict_proba(X_test.iloc[[record_index]])
predicted_class_rf = clf_RFC.predict(X_test.iloc[[record_index]])
print(f'Random Forest Predicted Probabilities for Record: {record_index} '
      f'and probabilities are : {predicted_probabilities_rf} and '
      f'predicted class is : {predicted_class_rf}')

-----------------------------------------------------------------------

OUTPUT:
Training Accuracy: 1.0
Test Accuracy: 0.85
Random Forest Predicted Probabilities for Record: 38 and
probabilities are : [[0.77 0.23]] and predicted class is : [0]

We neither have a visual representation of the decision tree nor the equation from logistic regression. So is there a method to enhance the explainability of the Random Forest model, allowing for detailed explanations of specific predictions and an understanding of the influencing factors? Achieving this would give us a potent yet interpretable model. That's where LIME comes to the rescue, and we will explore it in the next section.
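
For completeness: Random Forest does expose a global feature_importances_ ranking, but that only tells us which features matter on average across the entire dataset; it cannot explain why a specific record (say, record 38) was scored the way it was. A quick sketch of that global view, reusing clf_RFC and X_train from above:

import pandas as pd

# Global importances: useful for an overall ranking of features, but they say
# nothing about the contribution of each feature to a single prediction
importances = pd.Series(clf_RFC.feature_importances_,
                        index=X_train.columns).sort_values(ascending=False)
print(importances.head(10))

That per-prediction, local gap is exactly what LIME addresses.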

What is LIME?

LIME, or Local Interpretable Model-agnostic Explanations, is a machine learning tool that enhances model interpretability by training local surrogate models to explain individual predictions. It differs from traditional methods that analyze black box models’ internal components. LIME perturbs input data samples and observes changes in predictions, providing insights into model behavior. It is model-agnostic, allowing it to be applied to any machine learning model, making it versatile for explaining predictions across various domains and applications.
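
To make the mechanism concrete, here is a rough, conceptual sketch of the idea for a single record. It is not the library's actual implementation (the real LimeTabularExplainer also handles categorical features, discretization, and feature selection), but it shows the perturb, score, weight, and fit-a-surrogate loop:

import numpy as np
from sklearn.linear_model import Ridge

def lime_sketch(predict_proba, x_row, feature_stds, num_samples=1000, kernel_width=1.0):
    rng = np.random.default_rng(42)
    # 1. Perturb the record with feature-wise Gaussian noise
    perturbed = x_row + rng.normal(scale=feature_stds, size=(num_samples, len(x_row)))
    # 2. Score every perturbed sample with the black-box model
    preds = predict_proba(perturbed)[:, 1]
    # 3. Weight samples by their proximity to the original record
    distances = np.linalg.norm((perturbed - x_row) / (feature_stds + 1e-6), axis=1)
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)
    # 4. Fit a simple, weighted linear surrogate; its coefficients are the
    #    local explanation of feature influence around this record
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(perturbed, preds, sample_weight=weights)
    return surrogate.coef_

# Hypothetical usage with the random forest from the previous section:
# local_weights = lime_sketch(clf_RFC.predict_proba,
#                             X_test.iloc[record_index].values.astype(float),
#                             X_train.std().values.astype(float))

The actual library works on an interpretable representation of the features and selects only the top ones, but the weighted-linear-surrogate idea is the same.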

Application of LIME

Now, let's use LIME on the random forest model built in the previous section and try to explain the model results.

from lime.lime_tabular import LimeTabularExplainer

# Initialize the LimeTabularExplainer with valid feature names
explainer = LimeTabularExplainer(X_train.values,
                                 feature_names=X_train.columns.tolist(),
                                 class_names=df_train[target].unique(),
                                 mode='classification')

# Explain the prediction of the Random Forest model for the chosen instance
exp = explainer.explain_instance(X_test.iloc[[record_index]].values[0],
                                 clf_RFC.predict_proba, num_features=10)

# Visualize the explanation
exp.show_in_notebook(show_table=True)
fig = exp.as_pyplot_figure()
plt.tight_layout()
plt.show()

  • We start by importing LimeTabularExplainer from the LIME library
  • We instantiate the explainer object; for this, we need the training features, the class names (0 or 1), and the mode, which is classification in our case
  • We pick a record via the variable ‘record_index’ and let the explainer help us with the interpretation. We set num_features to 10, meaning we want the top 10 features impacting the prediction (a programmatic view of the same explanation is sketched after this list).
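
Besides the notebook and Matplotlib views, the explanation object can also be read programmatically, which is handy when explanations need to be logged or post-processed. A small sketch using exp.as_list():

# Each entry is a (feature rule, weight) pair; positive weights push the
# prediction towards class 1 and negative weights towards class 0
for feature_rule, weight in exp.as_list():
    print(f"{feature_rule}: {weight:+.4f}")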

Interpreting the output

Iteration 1: Record 38

Source: Author | Model and feature explanation for record # 38

How to interpret this result:

  • For record #38, the random forest model classified it as 0 and the predicted probabilities were [0.84, 0.16]
  • The bar chart shows that the red bars push the prediction towards class 0 (the customer is retained, i.e. does not churn)

Iteration 2: Record 23

Source: Author | Model and feature explanation for record # 23

How to interpret this result:

  • For record #23, the random forest model classified it as 1 (the customer churns) and the predicted probabilities were [0.24, 0.76]
  • The bar chart shows that the green bars push the prediction towards class 1 (churn)

Note: The bar chart to the right in each of the above plots is not from the LIME library; I included it in the output using Matplotlib just to draw a parallel with the output of another library called SHAP, which is also used to understand the decision-making processes of complex models. If you wish to know more about SHAP, please refer to Inside the Black Box: A Practical Approach to Model Interpretability.

Pros and Cons of LIME

Pros:

  1. Model Agnostic: LIME is versatile and can be applied to any machine learning model, including black box models, regardless of complexity or algorithm.
  2. Local Interpretability: LIME generates local surrogate models to explain individual predictions, providing interpretable explanations for specific instances, and aiding in understanding model behavior at a granular level.
  3. Versatility: LIME is a versatile tool that can handle various types of data, including structured, unstructured, and mixed data, making it suitable for various domains and applications.

Cons:

  1. Computational Overhead: Local surrogate models can be computationally expensive, particularly for large datasets or complex models, resulting in longer explanation times and increased computational resources.
  2. Sensitivity to Perturbations: LIME’s sensitivity to perturbations in input data samples can lead to inconsistent explanations and require careful interpretation due to its varying explanations for similar instances.
  3. Limited Global Understanding: LIME offers individual predictions but may not provide a comprehensive model understanding due to its focus on local interpretability and not capturing global trends or data relationships.

The complete code used in this blog can be found on Github

Conclusion

In summary, achieving a balance between the accuracy and explainability of machine learning models is vital. This not only fosters trust among stakeholders but also boosts the confidence of data science teams in the model’s performance, as they can justify its predictions. Tools like LIME play a crucial role by providing clear and easy explanations that are essential for driving innovation and solving complex problems.

I hope you liked the article and found it helpful.

Connect with me

Collection of blogs

Data Science Using Python and R
Python For Finance
App Development Using Python
GeoSpatial Analysis Using Python

FAQs

Q1: Why do machine learning models need to be explainable?
A1: Explainability is crucial for users to trust and understand how models make predictions. It provides transparency into the decision-making process, helping users validate the model’s outputs and identify potential biases or errors.

Q2: What are the benefits of balancing accuracy and explainability in machine learning models?
A2: The model’s reliability and understandability are ensured by balancing accuracy and explainability, fostering confidence in its performance and facilitating its adoption across various business functions and industries.

Q3: How does LIME contribute to model explainability?
A3: LIME is a powerful tool that provides clear, intuitive explanations for model predictions by perturbing input data around specific instances, allowing users to understand the factors driving individual predictions.

Q4: Is it possible for models to be accurate and explainable at the same time?
A4: Yes, LIME enables high accuracy and interpretability in machine learning models, balancing complexity with transparency by providing clear explanations for predictions, thereby enhancing user experience.

Q5: How can businesses leverage explainable machine learning models for problem-solving?
A5: Explainable machine learning models enable businesses to make informed decisions, optimize processes, and drive innovation by providing transparent insights and understanding the rationale behind model predictions.
