
Support min_delta in early stopping. #7137

Merged: 2 commits into master, Aug 3, 2021

Conversation

trivialfis
Member

Close #7136.
Close #7066.

codecov-commenter commented Jul 28, 2021

Codecov Report

Merging #7137 (26ac0c2) into master (92ae3ab) will not change coverage.
The diff coverage is 83.33%.


@@           Coverage Diff           @@
##           master    #7137   +/-   ##
=======================================
  Coverage   82.19%   82.19%           
=======================================
  Files          13       13           
  Lines        3999     3999           
=======================================
  Hits         3287     3287           
  Misses        712      712           
Impacted Files                        Coverage Δ
python-package/xgboost/callback.py    82.20% <83.33%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@MarkGuryanov

@trivialfis I'm not sure why we need both abs_tol and min_delta. Maybe I misunderstand the idea of abs_tol, but I thought of it as the minimum change in the monitored metric, in the direction of improvement, required to qualify as a substantial improvement. In that case we must not tolerate the metric getting worse, and we certainly must not count that as an "improvement": as I described in #7136, if we do, the model will keep training while the metric on the evaluation set gets slightly worse each step, which in the long run leads to overfitting and effectively nullifies the early stopping mechanism. Thus the metric must be strictly greater than the previous best to count as an "improvement".

It would be useful to rename the abs_tol parameter to min_delta and fix the calculation (either by switching + to -, or as it is done in #7066; mathematically they are the same), but having both of them and requiring at least one to be 0 would be quite misleading for users, IMHO.

Also, looking at the APIs of early stopping callbacks in other frameworks (for example, pytorch-lightning and tensorflow), I see only a single min_delta parameter and no tolerance parameter.
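
To make the point concrete, here is a minimal sketch of the two checks being debated, for a metric that is maximized; improved_with_abs_tol and improved_with_min_delta are hypothetical helpers for illustration, not the actual callback code:

```python
def improved_with_abs_tol(score: float, best: float, abs_tol: float) -> bool:
    # Tolerant check: a score that is slightly worse than the best
    # (by less than abs_tol) still counts as an "improvement", so the
    # early-stopping counter keeps resetting while the validation metric
    # slowly degrades; this is the failure mode described in #7136.
    return score > best - abs_tol


def improved_with_min_delta(score: float, best: float, min_delta: float) -> bool:
    # Strict check: the score must beat the previous best by more than
    # min_delta before the early-stopping counter is reset.
    return score - best > min_delta
```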

@trivialfis
Member Author

My initial thought was just to allow some turbulence in the validation results during training. If that's not useful and doesn't make sense, I can remove abs_tol completely.

@trivialfis
Member Author

Your argument makes sense to me. Will follow up with the removal.

@trivialfis
Member Author

Removed.
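
For context, a minimal usage sketch of the resulting API, assuming the min_delta keyword this PR adds to xgboost.callback.EarlyStopping and using synthetic data:

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X, y = rng.normal(size=(500, 10)), rng.normal(size=500)
dtrain = xgb.DMatrix(X[:400], label=y[:400])
dvalid = xgb.DMatrix(X[400:], label=y[400:])

# Stop when the validation metric has not improved on the previous best
# by more than 1e-3 for 10 consecutive rounds.
early_stop = xgb.callback.EarlyStopping(rounds=10, min_delta=1e-3, save_best=True)

booster = xgb.train(
    {"objective": "reg:squarederror"},
    dtrain,
    num_boost_round=1000,
    evals=[(dvalid, "validation")],
    callbacks=[early_stop],
)
```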

MarkGuryanov left a comment

I think everything is fine now. Thank you!

Successfully merging this pull request may close these issues.

EarlyStopping can incorrectly detect metric improvement with positive tolerance
4 participants