[REVIEW] Provide workaround for cupy.percentile bug #3315

wphicks · 2020-12-17T01:30:52Z

Ensure that the 100th quantile value returned by cupy.percentile is the maximum of the input array rather than (possibly) NaN due to cupy/cupy#4451. This eliminates an intermittent failure observed in tests of KBinsDiscretizer, which makes use of cupy.percentile. Note that this includes an alteration of the included sklearn code and should be reverted once the upstream cupy issue is resolved.

Resolve failure due to ValueError described in #2933.

Ensure that the 100th quantile value returned by cupy.percentile is the maximum of the input array rather than (possibly) NaN due to cupy/cupy#4451

wphicks · 2020-12-17T01:34:57Z

An open question for this PR is whether we should pull off the xfail on that KBinsDiscretizer test. I'm confident that the ValueError issue has been eliminated, but #2933 describes 2 other sporadic errors. I'll run 100K cycles tonight to try to see if those other errors are still reproducible. If not, I'd lean toward pulling off the xfail in this PR and then just keeping an eye out for those errors in CI, but I'm open to other opinions on that.

codecov-io · 2020-12-17T03:17:54Z

Codecov Report

Merging #3315 (ee8c1ef) into branch-0.18 (2e4388d) will decrease coverage by 0.03%.
The diff coverage is 92.88%.

@@               Coverage Diff               @@
##           branch-0.18    #3315      +/-   ##
===============================================
- Coverage        71.45%   71.42%   -0.04%     
===============================================
  Files              205      207       +2     
  Lines            16594    16741     +147     
===============================================
+ Hits             11858    11958     +100     
- Misses            4736     4783      +47

Impacted Files	Coverage Δ
python/cuml/neighbors/__init__.py	`100.00% <ø> (ø)`
python/cuml/neighbors/ann.pxd	`87.05% <87.05%> (ø)`
python/cuml/common/sparsefuncs.py	`91.66% <88.23%> (-2.34%)`	⬇️
python/cuml/linear_model/base.pyx	`97.36% <97.36%> (ø)`
python/cuml/neighbors/nearest_neighbors.pyx	`92.36% <98.24%> (+1.90%)`	⬆️
...hirdparty/sklearn/preprocessing/_discretization.py	`84.21% <100.00%> (+0.11%)`	⬆️
python/cuml/linear_model/elastic_net.pyx	`88.00% <100.00%> (-0.89%)`	⬇️
python/cuml/linear_model/lasso.pyx	`90.47% <100.00%> (-0.83%)`	⬇️
python/cuml/linear_model/linear_regression.pyx	`87.95% <100.00%> (-1.68%)`	⬇️
python/cuml/linear_model/ridge.pyx	`87.09% <100.00%> (-1.70%)`	⬇️
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 34efaf8...ee8c1ef. Read the comment docs.

wphicks · 2020-12-17T15:09:30Z

My 100K run errored out because of how pytest-repeat gathers tests, but a 10K run passed without issue, and (conservatively) I've probably run this about 500K times since I last saw the other errors reported in #2933. I'm going to pull off the xfail in this PR unless reviewers talk me out of it, and we can see if anything turns up in CI down the line. My prejudice is that it's better to have the test in place and running some validation of KBinsDiscretizer, even if we're not absolutely certain that the other errors are gone.

dantegd

lgtm, just one small comment

python/cuml/_thirdparty/sklearn/preprocessing/_discretization.py

viclafargue

LGTM. Thank you for this in-depth investigation! Hope we will soon have a fix in cuPy.

Provide workaround for cupy.percentile bug

56830a4

Ensure that the 100th quantile value returned by cupy.percentile is the maximum of the input array rather than (possibly) NaN due to cupy/cupy#4451

wphicks requested a review from a team as a code owner December 17, 2020 01:30

wphicks added bug Something isn't working non-breaking Non-breaking change labels Dec 17, 2020

wphicks mentioned this pull request Dec 17, 2020

[BUG] Sporadic KBinsDiscretizer pytests fail with quantile strategy #2933

Open

wphicks requested a review from viclafargue December 17, 2020 01:37

No longer mark KBinsDiscretizer test as xfail

ee8c1ef

wphicks changed the title ~~Provide workaround for cupy.percentile bug~~ [REVIEW] Provide workaround for cupy.percentile bug Dec 17, 2020

dantegd approved these changes Dec 17, 2020

View reviewed changes

python/cuml/_thirdparty/sklearn/preprocessing/_discretization.py Show resolved Hide resolved

wphicks mentioned this pull request Dec 17, 2020

[TASK] Remove workaround for cupy.percentile issue after it is corrected #3317

Open

dantegd added the 5 - Ready to Merge Testing and reviews complete, ready to merge label Dec 17, 2020

viclafargue approved these changes Dec 17, 2020

View reviewed changes

dantegd added 6 - Okay to Auto-Merge and removed 5 - Ready to Merge Testing and reviews complete, ready to merge labels Dec 17, 2020

rapids-bot bot merged commit 550121b into rapidsai:branch-0.18 Dec 17, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[REVIEW] Provide workaround for cupy.percentile bug #3315

[REVIEW] Provide workaround for cupy.percentile bug #3315

wphicks commented Dec 17, 2020 •

edited

Loading

wphicks commented Dec 17, 2020

codecov-io commented Dec 17, 2020 •

edited

Loading

wphicks commented Dec 17, 2020

dantegd left a comment

viclafargue left a comment

[REVIEW] Provide workaround for cupy.percentile bug #3315

[REVIEW] Provide workaround for cupy.percentile bug #3315

Conversation

wphicks commented Dec 17, 2020 • edited Loading

wphicks commented Dec 17, 2020

codecov-io commented Dec 17, 2020 • edited Loading

Codecov Report

wphicks commented Dec 17, 2020

dantegd left a comment

Choose a reason for hiding this comment

viclafargue left a comment

Choose a reason for hiding this comment

wphicks commented Dec 17, 2020 •

edited

Loading

codecov-io commented Dec 17, 2020 •

edited

Loading