Describe the bug
One of the FIL tests produces errors in CI. Here is an example:
https://gpuci.gpuopenanalytics.com/job/rapidsai/job/gpuci/job/cuml/job/prb/job/cuml-gpu-test/CUDA=10.2,OS=centos7,PYTHON=3.8/111/console
The failure is not deterministic. I have seen it three times on CI and reproduced it locally once; several other local attempts did not reproduce it.
Steps/Code to reproduce bug
git clone https://github.com/tfeher/cuml.git
nvidia-docker run --privileged -e HOST_USER_ID=0 -v$PWD:/mydata -w/mydata --rm -it rapidsai/rapidsai-dev-nightly:0.17-cuda10.2-devel-centos7-py3.8
mkdir cuml/cpp/build
cd cuml/cpp/build
cmake .. -DGPU_ARCHS=70 -DCMAKE_INSTALL_PREFIX=$CONDA_PREFIX
make -j
test/ml --gtest_filter=Fil*
This produces (non-deterministically):
09:38:27 /opt/conda/envs/rapids/conda-bld/libcuml_1606459702665/work/cpp/test/sg/fil_test.cu:363: Failure
09:38:27 Value of: raft::devArrMatch(want_preds_d, preds_d, ps.num_rows, raft::CompareApprox<float>(tolerance), stream)
09:38:27 Actual: false (actual=2 != expected=0 @19182)
09:38:27 Expected: true
09:38:27 [ FAILED ] FilTests/PredictSparse16FilTest.Predict/15, where GetParam() = num_rows = 20000, num_cols = 50, nan_prob = 0.05, depth = 8, num_trees = 60, leaf_prob = 0.05, output = RAW, threshold = 0, blocks_per_sm = 0, algo = 1, seed = 42, tolerance = 0.002, op = <, global_bias = 0.5, leaf_algo = 2, num_classes = 6 (239 ms)
Tests FilTests/PredictSparse16FilTest.Predict/15 and FilTests/PredictSparse16FilTest.Predict/17 were reported as failing.
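To help read the failure message, here is a minimal host-side sketch of an element-wise approximate comparison that reports the first mismatching index, which is roughly what `actual=2 != expected=0 @19182` conveys. The names `ApproxEqual` and `first_mismatch` are hypothetical, and the absolute-difference criterion is an assumption; this is not RAFT's implementation of `devArrMatch` or `CompareApprox`, which run against device arrays and may use a different tolerance rule.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an approximate comparator: two values match
// when their absolute difference is at most `eps`. This is an assumption;
// the real comparator may use a relative criterion instead.
struct ApproxEqual {
  float eps;
  bool operator()(float a, float b) const { return std::fabs(a - b) <= eps; }
};

// Returns the index of the first element pair that fails `cmp`,
// or -1 if every pair matches. A length difference counts as a
// mismatch at the first index that exists in only one array.
template <typename Compare>
std::ptrdiff_t first_mismatch(const std::vector<float>& expected,
                              const std::vector<float>& actual,
                              Compare cmp) {
  const std::size_t n = std::min(expected.size(), actual.size());
  for (std::size_t i = 0; i < n; ++i) {
    if (!cmp(expected[i], actual[i])) return static_cast<std::ptrdiff_t>(i);
  }
  if (expected.size() != actual.size()) return static_cast<std::ptrdiff_t>(n);
  return -1;
}
```

With 20000 expected values of 0, a tolerance of 0.002, and a single actual value of 2 at index 19182, `first_mismatch` would return 19182. That is the shape of the reported failure: one prediction out of 20000 is far outside the tolerance, so the result looks corrupted rather than affected by floating-point rounding.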
Additional information
FIL test temporarily disabled here: 5b64e25
A similar problem is described in #3205. I'm closing this bug, and will track further work on this problem in #3205.
Closing this as duplicate of #3205.
Added a missing `__syncthreads()` (#3215)
5697e09
Added a missing `__syncthreads()`.
- also re-enabled Sparse16 FIL tests
- this should fix #3205 and #3206
Authors:
- Andy Adinets
- John Zedlewski
- Dante Gama Dessavre
Approvers:
- Thejaswi Rao
- null
URL: #3215