
[feature proposal] Load best model after early stopping #4052

Closed
PhilipMay opened this issue Jan 10, 2019 · 6 comments · Fixed by #6302

Comments

@PhilipMay

Could you please implement a way to load the best model after early stopping? LightGBM also does this by default. See here: https://lightgbm.readthedocs.io/en/latest/Python-Intro.html#early-stopping

That would be a great improvement to my hyperparameter optimization workflow.

Thanks
Philip

@trivialfis
Member

Seems to be a nice feature. Will look into this in the future. @hcho3 WDYT?

@hcho3
Collaborator

hcho3 commented Jan 18, 2019

@trivialfis @PhilipMay Can you clarify what you mean? Currently, the scikit-learn interface automatically uses the best iteration when predicting:

ntree_limit : int
Limit number of trees in the prediction; defaults to best_ntree_limit if defined
(i.e. it has been trained with early stopping), otherwise 0 (use all trees).
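For context, a minimal sketch of how that behaves through the scikit-learn wrapper as of the XGBoost API at the time of this thread (the dataset and parameters here are illustrative, not from the thread):

```python
import numpy as np
import xgboost as xgb
from sklearn.model_selection import train_test_split

# Illustrative synthetic data, only to make the example runnable.
X = np.random.rand(1000, 10)
y = (X[:, 0] > 0.5).astype(int)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.2)

clf = xgb.XGBClassifier(n_estimators=500)
clf.fit(X_train, y_train,
        eval_set=[(X_valid, y_valid)],
        early_stopping_rounds=10)

# After early stopping, best_ntree_limit is set on the wrapper, and
# predict() defaults to it, i.e. the best model found is used.
print(clf.best_ntree_limit)
preds = clf.predict(X_valid)
```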

@PhilipMay
Author

PhilipMay commented Jan 18, 2019

@hcho3 This is very interesting. So with the scikit-learn interface it should work the way I would like it to (load or use the best model). But what about the situation where you use the Booster class and do not use the scikit-learn interface? The Booster class documentation just says:

Limit number of trees in the prediction; defaults to 0 (use all trees).


For me this means that it does not use the best model from early stopping.
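For the native interface, one way to get the best-iteration predictions is to pass the limit explicitly. A sketch, assuming params is your usual parameter dict and dtrain / dvalid are xgb.DMatrix objects defined elsewhere:

```python
import xgboost as xgb

params = {"objective": "binary:logistic"}  # illustrative; use your own params

# dtrain and dvalid are assumed to be existing xgb.DMatrix objects.
bst = xgb.train(params, dtrain, num_boost_round=500,
                evals=[(dvalid, "valid")],
                early_stopping_rounds=10)

# The Python wrapper attaches best_iteration / best_ntree_limit to the
# returned Booster, but Booster.predict() ignores them unless you pass
# ntree_limit explicitly:
preds = bst.predict(dvalid, ntree_limit=bst.best_ntree_limit)
```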

@hcho3
Collaborator

hcho3 commented Jan 18, 2019

Yes, it appears that currently you need to use the scikit-learn interface to do what you want. (And use pickle to save XGBClassifier / XGBRegressor objects.) It would be nice to add a num_iteration option to the save_model() function so that you can truncate the model at the time of serializing.
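A sketch of that pickle round trip, reusing the clf and X_valid objects from the scikit-learn example above:

```python
import pickle

# Persist the fitted wrapper object; best_ntree_limit is a Python-side
# attribute, so it is pickled along with the rest of the object.
with open("model.pkl", "wb") as f:
    pickle.dump(clf, f)

with open("model.pkl", "rb") as f:
    clf_loaded = pickle.load(f)

# predict() still defaults to the best iteration found by early stopping.
preds = clf_loaded.predict(X_valid)
```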

@PhilipMay
Author

PhilipMay commented Jan 18, 2019

> Yes, it appears that currently you need to use the scikit-learn interface to do what you want. (And use pickle to save XGBClassifier / XGBRegressor objects.) It would be nice to add a num_iteration option to the save_model() function so that you can truncate the model at the time of serializing.

Yes - and it would be nice to bring the scikit-learn interface and the Booster interface to a consistent state where they both use the best results from early stopping by default.

Or is this just a documentation bug?

@hcho3
Collaborator

hcho3 commented Jan 18, 2019

The issue is that only the scikit-learn interface saves the best number of rounds; the Booster object does not save it. As I said, it would be cleaner to simply truncate the model at the time of serializing (save_model()) to achieve consistency.
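Until something like that exists, a possible workaround (a sketch, not part of the thread) is to record the best iteration in a side file next to the saved Booster, since save_model() does not keep it. This reuses the bst and dvalid objects from the native-interface sketch above:

```python
import json
import xgboost as xgb

# save_model() writes only the trees; best_ntree_limit lives on the Python
# object and is lost on save, so record it separately.
bst.save_model("model.bin")
with open("model.bin.meta", "w") as f:
    json.dump({"best_ntree_limit": bst.best_ntree_limit}, f)

loaded = xgb.Booster()
loaded.load_model("model.bin")
with open("model.bin.meta") as f:
    meta = json.load(f)

# Pass the recorded limit explicitly when predicting with the reloaded model.
preds = loaded.predict(dvalid, ntree_limit=meta["best_ntree_limit"])
```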
