Native Pseudo-Huber loss support #5479
@LionOrCatThatIsTheQuestion Would you like to make a PR for this? It should just be a simple class defined in `src/objective/regression_loss.h`.
@trivialfis What evaluation metric should I use? RMSE or MAE would be my first guess. Here is my code so far: …
Does it make sense to use Pseudo-Huber loss as a metric?
I guess Pseudo-Huber loss would be an option too (it seems natural to choose the same function as metric and as loss?), or MAE. The idea was to implement Pseudo-Huber loss as a twice-differentiable approximation of MAE, so on second thought MSE as the metric would rather defeat the original purpose.
The advantage of MAE (and also MSE) is that they are more naturally interpretable. Pseudo-Huber loss does not take the same values as MAE where abs(y_pred - y_true) > 1; it only has the same linear shape there, as opposed to the quadratic one.
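For reference, the behavior described above follows directly from the definition in the Wikipedia article linked in the issue text: with $\delta = 1$ the per-residual loss is

$$
L(a) = \sqrt{1 + a^2} - 1, \qquad a = y_\text{pred} - y_\text{true},
$$

which behaves like $a^2/2$ for small $|a|$ (MSE-like) and like $|a| - 1$ for large $|a|$ (the linear MAE shape, offset by one).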
@LionOrCatThatIsTheQuestion We can set the default metric to be huber, as users can specify other metrics if they like. To me using huber as the default metric seems appropriate here. You can add a metric in `src/metric/elementwise_metric.cu`.
@trivialfis Could you explain what the function GetFinal(...) does? I used MAE as a reference:

```cpp
struct EvalRowPHE {
  char const* Name() const { return "phe"; }

  // Per-row Pseudo-Huber loss with delta fixed at 1.
  XGBOOST_DEVICE bst_float EvalRow(bst_float label, bst_float pred) const {
    bst_float diff = label - pred;
    return std::sqrt(1.0f + diff * diff) - 1.0f;
  }

  // Reduce the accumulated loss sum and weight sum to the reported value.
  static bst_float GetFinal(bst_float esum, bst_float wsum) {
    return wsum == 0 ? esum : esum / wsum;
  }
};
```
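For context on `GetFinal`: in this style of metric, the per-row losses from `EvalRow` are accumulated together with their weights, and `GetFinal` reduces that pair of sums to the reported value, i.e. a weighted mean. A minimal standalone sketch (plain C++; the `Evaluate` driver and the simplified types are assumptions for illustration, not XGBoost's actual plumbing):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Standalone stand-in for XGBoost's bst_float.
using bst_float = float;

struct EvalRowPHE {
  bst_float EvalRow(bst_float label, bst_float pred) const {
    bst_float diff = label - pred;
    return std::sqrt(1.0f + diff * diff) - 1.0f;
  }
  static bst_float GetFinal(bst_float esum, bst_float wsum) {
    return wsum == 0 ? esum : esum / wsum;
  }
};

// Hypothetical driver: accumulate weighted per-row losses, then let
// GetFinal reduce (esum, wsum) to the final metric -- a weighted mean.
bst_float Evaluate(std::vector<bst_float> const& labels,
                   std::vector<bst_float> const& preds,
                   std::vector<bst_float> const& weights) {
  EvalRowPHE row{};
  bst_float esum = 0.0f, wsum = 0.0f;
  for (std::size_t i = 0; i < labels.size(); ++i) {
    esum += weights[i] * row.EvalRow(labels[i], preds[i]);
    wsum += weights[i];
  }
  return EvalRowPHE::GetFinal(esum, wsum);
}
```

Returning `esum` unchanged when `wsum == 0` simply guards against dividing by zero.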
@LionOrCatThatIsTheQuestion Is there any reason we should fix delta to 1?
For normal cases …
delta should be 1 by default, but adjustable would be better than fixed. The question is more whether it is possible, and how, to implement an additional parameter for a metric. E.g. in the sklearn interface I would just use the keyword 'reg:pseudohubererror' to specify the metric.
Hi, is it possible to relax the constraint that delta equals 1, so that users can choose another delta such as 1.35 to achieve 95% statistical efficiency? Or maybe just set its default to 1.35 to be compatible with sklearn? Reference: https://scikit-learn.org/stable/modules/linear_model.html#huber-regression
Passing an additional parameter for a metric is done for Poisson regression and Tweedie regression, for example; see xgboost/src/objective/regression_obj.cu, lines 457 to 464 in 03cd087. Making delta adjustable would touch xgboost/src/objective/regression_loss.h, lines 101 to 112, and xgboost/src/objective/regression_obj.cu, lines 41 to 42, in 03cd087.
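Sketched below is what a configurable slope could look like, in the spirit of the loss structs in regression_loss.h (the names, signatures, and simplified types here are illustrative assumptions, not XGBoost's actual implementation; the derivatives follow from differentiating the Pseudo-Huber definition):

```cpp
#include <cmath>

// Hypothetical stand-in for XGBoost's bst_float.
using bst_float = float;

// Pseudo-Huber loss with an adjustable slope parameter:
//   L(a) = delta^2 * (sqrt(1 + (a/delta)^2) - 1),  a = pred - label.
struct PseudoHuberError {
  bst_float delta{1.0f};  // e.g. 1.35 for ~95% statistical efficiency

  bst_float Loss(bst_float pred, bst_float label) const {
    bst_float a = pred - label;
    bst_float z = a / delta;
    return delta * delta * (std::sqrt(1.0f + z * z) - 1.0f);
  }

  // First derivative w.r.t. pred: a / sqrt(1 + (a/delta)^2).
  bst_float FirstOrderGradient(bst_float pred, bst_float label) const {
    bst_float a = pred - label;
    bst_float z = a / delta;
    return a / std::sqrt(1.0f + z * z);
  }

  // Second derivative w.r.t. pred: (1 + (a/delta)^2)^(-3/2).
  bst_float SecondOrderGradient(bst_float pred, bst_float label) const {
    bst_float z = 1.0f + ((pred - label) / delta) * ((pred - label) / delta);
    return 1.0f / (z * std::sqrt(z));
  }
};
```

With delta = 1 this reduces to the gradient a / sqrt(1 + a^2) and hessian (1 + a^2)^(-3/2) of the fixed-delta version.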
Reopening as a reminder.
Any plans on making delta adjustable? It turns out Pseudo-Huber loss is very effective against outliers, and I would like to tune delta further to get better results.
Would it be possible to support Pseudo-Huber loss (https://en.wikipedia.org/wiki/Huber_loss#Pseudo-Huber_loss_function) natively?
I implemented it as a custom loss function (I use the Python sklearn API), but the feature importance plots don't support custom loss functions (and it slows down learning compared to 'reg:squarederror').
The basic problem is the need for a robust regression objective; MSE can be sensitive to outliers in applications.
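For reference, the loss from the linked Wikipedia article: for a residual $a = y_\text{pred} - y_\text{true}$ and slope parameter $\delta$,

$$
L_\delta(a) = \delta^2\left(\sqrt{1 + (a/\delta)^2} - 1\right),
$$

which is approximately quadratic near zero and approaches a line of slope $\delta$ for large $|a|$. Unlike the piecewise-defined Huber loss it is smooth (twice differentiable) everywhere, which suits gradient boosting's second-order updates.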