This repository has been archived by the owner on Jul 17, 2023. It is now read-only.

Interpretability of Models #18

Open
mtanco opened this issue Mar 23, 2021 · 2 comments
Labels
area/ml Machine learning related issues type/design

Comments

@mtanco

mtanco commented Mar 23, 2021

For H2O-3, this code snippet may be helpful. For a specific row in the dataset, it builds a table showing how the model's prediction changes as different values are substituted for a feature, which is important for explainability. The snippet creates partial dependence plots for the feature with the largest positive contribution and the feature with the largest negative contribution for that row:

import h2o
from h2o.automl import H2OAutoML
h2o.init()

# Load data into H2O
df = h2o.import_file('https://h2o-internal-release.s3-us-west-2.amazonaws.com/data/Splunk/churn.csv')
y = 'Churn?'
x = df.columns
x.remove(y)

# Build models
aml = H2OAutoML(max_models = 2, seed = 1)
aml.train(x = x, y = y, training_frame = df)

# Save the best model
model = aml.leader

# Get how much each feature contributed for each person
pred_contribs = model.predict_contributions(df).drop('BiasTerm').as_data_frame()

# Row index of the phone number (customer) to explain
row_id = 77


# Columns that are important for this user
min_contrib = pred_contribs.idxmin(axis=1)[row_id]
max_contrib = pred_contribs.idxmax(axis=1)[row_id]


min_pdp = model.partial_plot(
    df,
    cols=[min_contrib],
    plot=False,  # return the data table only; set True to also render the plot
    nbins=20 if not df[min_contrib].isfactor()[0] else 1 + df[min_contrib].nlevels()[0],
    row_index=row_id
)
display(min_pdp)


max_pdp = model.partial_plot(
    df,
    cols=[max_contrib],
    plot=False,  # return the data table only; set True to also render the plot
    nbins=20 if not df[max_contrib].isfactor()[0] else 1 + df[max_contrib].nlevels()[0],
    row_index=row_id
)
display(max_pdp)
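The per-row feature selection above (the `idxmin`/`idxmax` over the contribution table) can be illustrated without an H2O cluster. A minimal pure-Python sketch, assuming the SHAP contributions have already been exported as a plain dict of column → per-row values (the column names and numbers below are made up for illustration):

```python
# Pick the top positive and negative SHAP contributions for one row,
# given contributions exported as {column: [value per row]}.

def top_contributions(contribs, row_id):
    """Return (most_negative_column, most_positive_column) for a row."""
    row = {col: values[row_id] for col, values in contribs.items()}
    min_col = min(row, key=row.get)  # pushes the prediction down the most
    max_col = max(row, key=row.get)  # pushes the prediction up the most
    return min_col, max_col

# Hypothetical contributions for three rows and three features
contribs = {
    'Day Mins':  [0.8, -0.1, 0.3],
    'Intl Plan': [-0.5, 0.6, -0.2],
    'CustServ':  [0.1, 0.2, 0.9],
}

print(top_contributions(contribs, 0))  # ('Intl Plan', 'Day Mins')
```

These two column names are then exactly what gets passed to `cols=` in the partial dependence calls above.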
@mtanco mtanco added the type/feature Feature request label Mar 23, 2021
@geomodular geomodular added area/ml Machine learning related issues and removed type/feature Feature request labels Apr 28, 2021
@vopani
Contributor

vopani commented Jun 14, 2021

This example: https://wave.h2o.ai/docs/examples/ml-h2o-shap shows how to get the SHAP values from a WaveML model. I think that is good enough for developers to build custom downstream plots/cards.

@geomodular Is there anything more we want to accomplish with WaveML regarding this?

@geomodular
Collaborator

The ideal goal would be to explain the model through the Wave ML interface without touching the .model param, i.e. m.explain(). The same should be doable with a DAI model as well.

That's the general idea; we can bend it as needed.
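One hypothetical shape for such an interface (the class, method names, and backend protocol here are assumptions for illustration, not the actual Wave ML API) would delegate to the underlying engine's contribution method behind a single explain() call:

```python
# Hypothetical sketch of an m.explain() wrapper; names and signatures
# are assumptions, not the actual Wave ML API.

class Model:
    def __init__(self, backend):
        self._backend = backend  # e.g. an H2O-3 or DAI model underneath

    def explain(self, data):
        """Return per-row feature contributions from the backend."""
        return self._backend.predict_contributions(data)

class FakeH2OBackend:
    """Stand-in backend: echoes each feature's value as its contribution."""
    def predict_contributions(self, data):
        return [dict(row) for row in data]

m = Model(FakeH2OBackend())
print(m.explain([{'Day Mins': 0.5}]))  # [{'Day Mins': 0.5}]
```

The point of the wrapper is that callers never reach into .model directly; swapping the fake backend for an H2O-3 or DAI model would not change the calling code.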
