Add random forest option for partial dependence #13
@djgagne @jsschreck Following our discussion on the MILES Slack about XGBoost sometimes generating negative values in partial dependence (PD) plots, I've attempted to update the code to add a random forest (RF) option. RF is currently set as the default, but that could be changed. I tested this with an existing echo run: the PD plots are comparable to those produced with XGB, but they still show negative values, which I'm not sure makes sense for these metrics. Here are some example comparisons using a set of metrics from @mariajmolina. Note that the vcorr_cust metric is likely out of date, but valid_mae and vmse_extreme_outp should be relevant (and I believe should not have negative values).
I welcome comments and suggestions on any of the above, on how best to let users choose between the two options, and on how the RF model is configured (e.g., hyperparameters).
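
To make the question about choosing between the two options concrete, here is a minimal sketch of one possible approach: a single string option that selects the surrogate model, plus a hand-rolled one-dimensional PD calculation for sanity-checking the sign of the curves. It assumes scikit-learn and xgboost are installed. The names `make_surrogate` and `partial_dependence_1d`, the `model_type` option, and the hyperparameter values are all hypothetical and are not taken from this PR.

```python
def make_surrogate(model_type="rf", **overrides):
    """Return an unfitted regressor for the PD surrogate model.

    Hypothetical helper: imports are lazy so only the chosen backend
    needs to be installed. Hyperparameter defaults are placeholders.
    """
    if model_type == "rf":
        from sklearn.ensemble import RandomForestRegressor
        params = {"n_estimators": 200, "min_samples_leaf": 2, "random_state": 0}
        params.update(overrides)
        return RandomForestRegressor(**params)
    if model_type == "xgb":
        from xgboost import XGBRegressor
        params = {"n_estimators": 200, "max_depth": 4, "learning_rate": 0.1}
        params.update(overrides)
        return XGBRegressor(**params)
    raise ValueError(f"model_type must be 'rf' or 'xgb', got {model_type!r}")


def partial_dependence_1d(predict, rows, feature, grid):
    """Average prediction over all rows with one feature forced to each grid value.

    predict: callable taking a list of rows and returning one prediction per row
    rows:    list of feature rows (each a sequence of numbers)
    feature: column index of the feature being varied
    grid:    values to substitute into that column
    """
    curve = []
    for value in grid:
        modified = []
        for row in rows:
            new_row = list(row)       # copy so the input rows are not mutated
            new_row[feature] = value  # force the feature to the grid value
            modified.append(new_row)
        preds = predict(modified)
        curve.append(sum(preds) / len(preds))
    return curve
```

With a fitted model, usage would look like `model = make_surrogate("rf").fit(X, y)` followed by `partial_dependence_1d(model.predict, X_rows, 0, grid)`. One check that may help with the negative values: a random forest regressor predicts averages of training targets, so its PD curve should stay within the range of the `y` it was fit on, whereas boosted trees sum corrections and can overshoot that range. Comparing the curve against `min(y)` and `max(y)` would show whether the negatives come from the model or from the target values themselves.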