Adding SHAP predict values as new output file #982
Signed-off-by: mattahrens <[email protected]>
Thanks @mattahrens
The CSV file shap_values.csv is missing from the PR.
The |
@cindyyuanjiang Since you were adding the CLI, please take a look at this PR to see whether anything here affects the CLI feature.
@mattahrens LGTM. Keep in mind that this prints the SHAP values aggregated across the entire dataset. If you want to see per-sample values, you'd need to modify this line, but you'd then end up with a dataframe of shape (num_samples, num_features) instead of the current (num_features, 1). However, I'm not sure that would be very usable compared with debugging via individual SHAP waterfall plots offline.
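To illustrate the shape difference described above, here is a minimal sketch. It does not use the PR's actual code: the feature names and per-sample values are synthetic, and the aggregation (mean absolute SHAP value per feature) is an assumption, since the thread does not say how the dataset-wide values are computed.

```python
import numpy as np
import pandas as pd

# Hypothetical feature names; the real model's features are not shown in the thread.
feature_names = ["feature_a", "feature_b", "feature_c"]

# Synthetic stand-in for per-sample SHAP values: one row per sample, one column per feature.
rng = np.random.default_rng(0)
per_sample = pd.DataFrame(rng.normal(size=(5, 3)), columns=feature_names)

# Assumed aggregation: mean absolute SHAP value per feature across the whole dataset.
aggregated = per_sample.abs().mean(axis=0).to_frame(name="shap_value")

print(per_sample.shape)  # (num_samples, num_features)
print(aggregated.shape)  # (num_features, 1)
```

The aggregated frame gives one row per feature, which is compact enough for a summary CSV, while the per-sample frame grows with the size of the prediction set.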
I wanted to start with the values across the entire dataset and get feedback. If we want to go per-sample, we can enhance it later to provide that.
The new output file shap_values.csv in the xgboost_predictions folder will contain feature importance for the prediction set.