Expose the secondary stopping condition for QN solver #3777

achirkin · 2021-04-21T09:30:36Z

- Expose the `delta` parameter of the QN solver, which controls the stopping condition based on the change in loss value.
- Set a reasonable default for the parameter that should keep the behavior close to sklearn in most cases.

Note: this change does not expose `delta` to the wrapper class `LogisticRegression`.

Note: although this change does not break the Python API, it does break the C/C++ API.

Contributes to solving #3645
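
For readers unfamiliar with the two stopping conditions discussed here, the following is a minimal sketch in plain Python, not the cuML implementation. It assumes the primary condition compares the gradient norm against `tol` and the secondary condition compares the relative change of the loss against `delta`. The function names, the exact formulas, and the toy quadratic objective are all assumptions made for illustration.

```python
import math


def ill_conditioned_quadratic(x, scales=(1.0, 0.001)):
    """Toy objective (an assumption for this sketch): 0.5 * sum(c_i * x_i**2)."""
    loss = 0.5 * sum(c * v * v for c, v in zip(scales, x))
    grad = [c * v for c, v in zip(scales, x)]
    return loss, grad


def minimize_with_two_stops(loss_and_grad, x0, step=0.1, tol=1e-4,
                            delta=1e-6, max_iter=1000):
    """Plain gradient descent with two stopping conditions.

    Returns (x, n_iter, reason), where reason is one of
    "gradient", "loss_change", or "max_iter".
    """
    x = list(x0)
    loss, grad = loss_and_grad(x)
    for it in range(1, max_iter + 1):
        # Primary condition: gradient norm small relative to the parameter norm.
        gnorm = math.sqrt(sum(g * g for g in grad))
        xnorm = math.sqrt(sum(v * v for v in x))
        if gnorm <= tol * max(1.0, xnorm):
            return x, it - 1, "gradient"

        # One gradient descent step.
        x = [v - step * g for v, g in zip(x, grad)]
        new_loss, grad = loss_and_grad(x)

        # Secondary condition: relative change of the loss value.
        # delta <= 0 disables this check in the sketch.
        scale = max(abs(loss), abs(new_loss), 1.0)
        if delta > 0 and abs(loss - new_loss) <= delta * scale:
            return x, it, "loss_change"
        loss = new_loss
    return x, max_iter, "max_iter"


# Compare the run with the secondary condition disabled and enabled.
for d in (0.0, 1e-6):
    _, n_iter, reason = minimize_with_two_stops(
        ill_conditioned_quadratic, [1.0, 1.0], delta=d)
    print(f"delta={d}: stopped after {n_iter} iterations ({reason})")
```

On this ill-conditioned toy problem the gradient shrinks slowly along one direction, so with `delta = 0` the loop runs until `max_iter`, while a positive `delta` stops it once the loss no longer changes noticeably.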

achirkin · 2021-04-21T09:42:26Z

The change in convergence (issue #3645).

Before (secondary condition disabled): [benchmark plot not preserved]

After (delta = tol): [benchmark plot not preserved]

NB: these benchmarks were run on top of #3766 and #3774.

achirkin · 2021-04-21T09:44:00Z

We may also need to readjust the default parameter value if the stopping condition logic changes again in #3766.

python/cuml/solvers/qn.pyx

dantegd

Code looks good, just a couple of comments about docstrings

cpp/include/cuml/linear_model/glm.hpp

python/cuml/solvers/qn.pyx

tfeher

Thanks Artem for the PR! Looks good overall, just one suggestion: it would make sense to expose delta in logistic_regression.pyx as well.

python/cuml/solvers/qn.pyx

achirkin · 2021-04-26T08:14:55Z

> it would make sense to expose delta in logistic_regression.pyx as well.

I can, but I am not 100% sure we should. Are we sure that other solvers will have this parameter in the future as well? We already hide some parameters, e.g. lbfgs_memory. Also, sklearn does not expose this parameter either.
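
The trade-off described above can be sketched as follows. This is an illustration only: `SolverSketch` and `EstimatorSketch` are hypothetical stand-ins, not the cuML `QN` or `LogisticRegression` classes, and the default values are assumptions.

```python
class SolverSketch:
    """Hypothetical low-level solver that exposes every tuning knob."""

    def __init__(self, tol=1e-4, delta=None, lbfgs_memory=5):
        # Store the parameters as given, without modifying them here.
        self.tol = tol
        self.delta = delta
        self.lbfgs_memory = lbfgs_memory

    def get_param_names(self):
        return ["tol", "delta", "lbfgs_memory"]


class EstimatorSketch:
    """Hypothetical high-level wrapper exposing only solver-agnostic parameters."""

    def __init__(self, tol=1e-4):
        self.tol = tol
        # Solver-specific knobs (delta, lbfgs_memory) keep the solver defaults,
        # so the wrapper API stays valid if the solver is swapped later.
        self.solver = SolverSketch(tol=tol)

    def get_param_names(self):
        return ["tol"]


print(SolverSketch().get_param_names())
print(EstimatorSketch().get_param_names())
```

Keeping `delta` out of the wrapper means a future solver without such a parameter would not force a change to the wrapper's public signature.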

codecov-commenter · 2021-04-26T10:17:34Z

Codecov Report

❗ No coverage uploaded for pull request base (branch-0.20@2f0af34).
The diff coverage is n/a.

@@              Coverage Diff               @@
##             branch-0.20    #3777   +/-   ##
==============================================
  Coverage               ?   85.96%           
==============================================
  Files                  ?      225           
  Lines                  ?    17000           
  Branches               ?        0           
==============================================
  Hits                   ?    14614           
  Misses                 ?     2386           
  Partials               ?        0

| Flag | Coverage Δ |
|---|---|
| dask | `48.91% <0.00%> (?)` |
| non-dask | `77.81% <0.00%> (?)` |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 2f0af34...3066fd2.

tfeher

Thanks Artem for updating the PR title and description. Just to note down our offline discussion: you recommend keeping the LogisticRegression API unchanged (not exposing delta there) because of:

- the current default, which is expected to be small enough in comparison to sklearn,
- the points mentioned here: Expose the secondary stopping condition for QN solver #3777 (comment).

Sounds fair to me.

dantegd · 2021-04-26T15:10:36Z

@gpucibot merge

- Expose a parameter `delta` of the `QN` solver to control the loss value change stopping condition
- Set a reasonable default for the parameter value that should keep the behavior close to sklearn in most cases

Note, this change does not expose `delta` to the wrapper class `LogisticRegression`.

Note, although this change does not break the python API, it does break the C/C++ API.

Contributes to solving rapidsai#3645

Authors:
- Artem M. Chirkin (https://github.com/achirkin)

Approvers:
- Tamas Bela Feher (https://github.com/tfeher)
- Dante Gama Dessavre (https://github.com/dantegd)

URL: rapidsai#3777

achirkin added the improvement (Improvement / enhancement to an existing function) and non-breaking (Non-breaking change) labels Apr 21, 2021

github-actions bot added the CUDA/C++ and Cython / Python (Cython or Python issue) labels Apr 21, 2021

achirkin requested a review from tfeher April 21, 2021 09:31

achirkin self-assigned this Apr 21, 2021

achirkin marked this pull request as ready for review April 21, 2021 09:42

achirkin requested review from a team as code owners April 21, 2021 09:42

achirkin commented Apr 21, 2021

python/cuml/solvers/qn.pyx (outdated)

achirkin force-pushed the enh-ext-qn-expose-delta branch 2 times, most recently from d05ac68 to 24684ad Compare April 22, 2021 06:17

dantegd added the 3 - Ready for Review (Ready for review by team) label Apr 22, 2021

dantegd requested changes Apr 25, 2021

cpp/include/cuml/linear_model/glm.hpp (outdated)

python/cuml/solvers/qn.pyx

dantegd added the 4 - Waiting on Author (Waiting for author to respond to review) label and removed the 3 - Ready for Review (Ready for review by team) label Apr 25, 2021

achirkin added 3 commits April 26, 2021 08:34

Exposed target value stopping condition (delta)

95eca57

Add delta to get_param_names

14be2d1

Enhanced docs

6e5ed6d

achirkin force-pushed the enh-ext-qn-expose-delta branch from 24684ad to 6e5ed6d Compare April 26, 2021 07:43

Make flake8 happier

039b705

tfeher requested changes Apr 26, 2021

python/cuml/solvers/qn.pyx (outdated)

python/cuml/solvers/qn.pyx (outdated)

Don't modify delta in __init__

3066fd2

achirkin changed the title ~~Expose the secondary stopping condition for QN solver (logistic regression)~~ Expose the secondary stopping condition for QN solver Apr 26, 2021

achirkin added the 3 - Ready for Review (Ready for review by team) label and removed the 4 - Waiting on Author (Waiting for author to respond to review) label Apr 26, 2021

achirkin requested review from dantegd and tfeher April 26, 2021 11:40

tfeher approved these changes Apr 26, 2021

achirkin mentioned this pull request Apr 26, 2021

Tolerate QN linesearch failures when it's harmless #3791

Merged

dantegd approved these changes Apr 26, 2021

rapids-bot bot merged commit f71cbc1 into rapidsai:branch-0.20 Apr 26, 2021

achirkin mentioned this pull request Apr 27, 2021

[BUG] Logistic regression coefficients (for feature importance) significantly differ from Scikit-learn #3645

Closed

tfeher mentioned this pull request May 7, 2021

Accuracy issues in Logistic Regression with L1 penalty #1293

Closed
