Sync with upstream #20
Commits on Nov 18, 2020
`#include <cuml/manifold/umap.hpp>` works now.
Co-authored-by: Corey J. Nolet
Commit 684b261
[REVIEW] Reorganize Pytest Config and Add Quick Run Option (#3137)
* Moving conftest.py files around and adding quick_run plugin
* Adding PR to CHANGELOG
* Incorporating feedback from code review
Commit e377ae9
[REVIEW] Add QuasiNewton tests (#3135)
* Initial cython test commit
* Update changelog
* Style fixes
Co-authored-by: Nanthini Balasubramanian
Co-authored-by: Dante Gama Dessavre
Commit d8b4765
Commits on Nov 19, 2020
[REVIEW] Get rid of warnings in random projection test and related deprecation warnings (#3155)
* Get rid of warnings in random projections test
* Update changelog
* Fix style
* Update other deprecated make_blob imports
Commit bb006a4
Force local install by specifying exact build string (#3156)
* FIX Force local install by specifying exact build string
* DOC Update changelog
* Update ci/gpu/build.sh
Co-authored-by: AJ Schmidt
Commit 0d7b66c
[REVIEW] Update flake8 Config To With Per File Settings (#3002)
* Update flake8 config to join python/cython configuration and improve setup to check __init__.py files
* Fixing linting issues in previously ignored __init__.py files
* Update flake8 config to join python/cython configuration and improve setup to check __init__.py files
* Fixing linting issues in previously ignored __init__.py files
* Adding PR to CHANGELOG
* Incorporating feedback from code review
* Fixing style issues after merge with branch-0.17
Co-authored-by: Corey J. Nolet
Co-authored-by: Dante Gama Dessavre
Commit 3974810
[REVIEW] Adding Ability to Set Arbitrary Cmake Flags in ./build.sh [skip-ci] (#3144)
* Adding ability to set arbitrary cmake flags in ./build.sh via the $CUML_ADDL_CMAKE_ARGS variable
* Adding PR to CHANGELOG
* Adding more help info requested from code review.
Co-authored-by: John Zedlewski
Commit d30edd9
[REVIEW] Sparse KNN + UMAP Sparse Inputs (#2836)
* Adding brute force knn shell to sparse
* Stubbing out algorithm flow
* Adding initial headers to wrapper
* Performing idx batching
* Starting to fill in cusparse calls
* Checking in
* Beginning to add selection kernel
* Finished header
* Updates. Need to finish populating merge buffer
* Using block select for selecting k and using 3-partition merge buffer
* Logic is just about done.
* Checking in changes. Need to swap out cuda 11 cusparse calls for cuda 10.2 version
* Everything is building. Need end-to-end test
* Running clang format
* Updating changelog
* Using raft's cusparse_wrappers.h instead of cuml
* Removing cuda11-required GEMM calls (commenting them out for now, will swap them out shortly)
* Fixing clang style
* Separating distance computation from selection from general brute force algorithm to make pieces more reusable
* Updating clang style
* Adding batcher to help ease batch state management
* Fixing clang style
* More clang fixes
* IP distance is computed using search * index.T.
* Making type template for value_t all the way through knn_merge_parts
* Adding simple googletest for sparse pairwise dists. The transpose conversion seems super expensive, but maybe it's necessary.
* Completing test for basic inner product distances
* Removing prints from test
* Cleaning up batching for knn. Ready to gtest
* KNN w/ max inner product is working
* Adding guts of expanded l2 computation.
* Cleaning up some debug prints
* Fixing clang format
* More cleanup and clang style fix
* Fixing style for sparse knn prim test
* Hoping i've captured all the clang updates
* Updating per include_checker
* I feel like I'm bouncing back and forth between clang and include checker
* Refactoring sparse pairwise dists to return dense outputs
* Beginning python layer
* Adding python layer for sparse inputs to nearest neighbors
* End to end sparse knn works. Need to finish norms for expanded euclidean and expose it.
* Removing unused file
* Adding gtest for expanded l2.
* Sparse l2 matches sklearn
* Fixing clang format style
* Fixing style in gtests
* Lots of changes and cleanup. Still need to flip the batching
* Progress on tiling. Still a failure when tile sizes don't match up.
* Tiling w/ uneven batch sizes works! Now just need to figure out what to do when the leftover values are <k
* Some further optimizations are necessary, but this works for now.
* Ready for cleanup
* Parametrizing sparse knn tests
* More cleanup.
* Fixing clang format
* Fixing clang format style
* Fixing flake8 for sparse nn tests
* Fixing googletests
* More cleanup of sparse knn
* Adding sparse support to UMAP by abstracting the inputs
* Everything's building. Have one template issue to fix in the sparse knn
* Updates to API
* Using a struct to manage the knn graph output state
* C++ side is largely done. Still need to figure out what to do w/ the separate int64_t type in the sparse knn
* Removing examples/comms, which seems to have gotten re-checked in by mistake
* Fixing c++ style
* Fixing include checks
* This darn style checker is going to kill me.....
* Adding template type params for output
* UMAP is officially accepting sparse inputs
* More cleanup
* Cleaning up gtests and making them easier to write
* Fixing up and parametrizing tests
* Fixing style
* Fixing python style
* More clang format style fixes
* Pulled umap inputs classes to more shared location so tsne can use them. Added kselection gtest
* Updating clang format
* Fixing bad ide refactor
* Updating changelog
* Fixing more clang format
* Fixing flake8 style. Not sure why these didn't show up locally
* Decomposing sparse knn into a class.
* Review feedback
* Better umap sparse test
* More testing updates
* Adding docs to some of the remaining prims in csr.cuh
* Adding gtests for transpose and row slice. Need to add one for todense
* GTest for csr to dense
* Fixing style
* Removing debug logging from new gtests
* Fixing flake8 style
* Getting build to pass
* Running clang-tidy
* Fixing format for sparse gtests
* Adding 'algo_params' to get_param_names()
* Removing cumlarray output in kneighbors
* Finishing review feedback
* Fixing style
* Fixing format
* clang-format
* Style changes
* More review updates
* Style updates
* Running clang format on distance.cuh
* Running clang format on tests
* Fixing cython style
* Updating RAFT commit
* Updating neighbors from bad merge
Commit b205e8f
[Breaking] Add min_samples_split + Rename min_rows_per_node -> min_samples_leaf (#3132)
* Enforce min_rows_per_node in experimental RF backend
* Add min_samples_split hyperparameter
* Use correct definition of min_samples_split
* Rename range_len -> n_samples
* Add min_samples_split to Dask docstring
* Rename min_rows_per_node -> min_samples_leaf
* Update docstring for min_samples_leaf
* Correctly apply min_samples_split in new RF backend
* Address reviewer's comment
* Fix broken tests in BatchedLevelAlgo/DtRegTestF.Test
* Adjust accuracy requirement in test RFBatchedRegTests/RFBatchedRegTestF.Fit/5
* Add unit tests for min_samples_split, min_samples_leaf
* Add descriptive comments for compound literals
* Fix formatting
* Add changelog
* Organize unit tests under prefix BatchedLevelAlgoUnitTest
* Change default value for min_samples_leaf to 1
* Deprecate min_rows_per_node; guide users to use min_samples_leaf
* Fix style error
Commit b7bfb7e
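The two hyperparameters named in this commit use scikit-learn's names. As a minimal sketch, assuming scikit-learn's definitions (a node needs at least `min_samples_split` samples before it may be split, and each child must keep at least `min_samples_leaf` samples), a candidate split could be checked as follows. The helper name `split_is_allowed` is hypothetical, and this is an illustration rather than cuML's implementation.

```python
def split_is_allowed(n_node, n_left, n_right,
                     min_samples_split=2, min_samples_leaf=1):
    """Return True if a candidate split respects both sample-count constraints.

    Hypothetical helper illustrating scikit-learn's definitions; not cuML code.
    """
    if n_left + n_right != n_node:
        raise ValueError("children must partition the node's samples")
    # The node itself must be large enough to be considered for splitting.
    if n_node < min_samples_split:
        return False
    # Each child must retain enough samples to be a valid leaf.
    return n_left >= min_samples_leaf and n_right >= min_samples_leaf


print(split_is_allowed(10, 7, 3, min_samples_split=4, min_samples_leaf=3))
print(split_is_allowed(10, 9, 1, min_samples_split=4, min_samples_leaf=3))
```

The first call accepts the split; the second rejects it because the right child would hold only one sample.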
Commits on Nov 20, 2020
[REVIEW][PROPOSAL] Add tags and preferred memory order tags to estimators (#3113)
* FEA Add preferred_order class parameter to linear models
* ENH adopt tags from scikit-learn API to support preferred order attribute
* DOC remove attribute docstrings
* FIX Change straggling classes
* FIX Change straggling classes
* FIX Add missing self
* FIX straggling attribute
* ENH Add device data tag for proposal
* FEA Add all scikit-learn API tags to base and improve gpu input types tag
* FEA Add preferred_order tag to cluster models
* FEA Add preferred_order tag to most models
* ENH Improvements and PR review feedback
* DOC add tag documentation to estimator guide
* DOC add scikit link
* Update wiki/python/ESTIMATOR_GUIDE.md
  Co-authored-by: Corey J. Nolet
* Update wiki/python/ESTIMATOR_GUIDE.md
  Co-authored-by: Corey J. Nolet
* Update wiki/python/ESTIMATOR_GUIDE.md
  Co-authored-by: Corey J. Nolet
* Update wiki/python/ESTIMATOR_GUIDE.md
  Co-authored-by: Corey J. Nolet
* Update wiki/python/ESTIMATOR_GUIDE.md
  Co-authored-by: Corey J. Nolet
* ENH Rename test_fit to test_api and add tags tests
* FIX fixes from PR review
* DOC Added entry to changelog
* FIX PEP8 fixes
Co-authored-by: Corey J. Nolet
Commit b3e4827
[REVIEW] Removing extra unneeded file (#3162)
* Removing extra unneeded file
* Updating changelog
Commit 2877fb9
[REVIEW] Fix access to attributes of individual NB objects in dask NB (#3152)
* FIX Access to attributes of individual NB objects in dask NB
* DOC Added entry to changelog
* ENH Add pytest
* FIX PEP8 fixes
Co-authored-by: John Zedlewski
Commit 1078746
[REVIEW] FIL: Add optimization parameter `blocks_per_sm` for all but the tiniest models (#3032)
* just control block count
* blocks_per_sm can now be passed through treelite_params_t or forest_params_t
* changelog
* made blocks_per_sm mandatory; added tests; fixed a bug
* changelog
* added tests, moved __syncthreads() to common for all acc's, removed most blockIdx.x uses
* removed blocks_per_sm from python API, to avoid a longer discussion on best set
* simplified output loops
* addressed other review comments
* fixed bad merge conflict resolution
* comment for blocks_per_sm in fil.pyx
* style
Commit 2a84c52
Commits on Nov 21, 2020
[REVIEW] FIL: use tree reduction for GROVE_PER_CLASS_FEW_CLASSES (#2988)
* binary reduction: half way there
* quaternary reduction
* changelog
* remove accidental files
* generalize the multireduction
* adding dedicated tests for multireduction; style
* change trap; into setting an atomic.
* split into n tests, one per size
* ?
* tried thrust + rmm, no rmm dependency in tests it seems
* no rmm, sync allocations
* style
* fixed some testing bugs; expanded test to all block sizes; better documentation
* fixed wrong test
* simplify comparison
* member -> non-member function pointer as test template argument
* style
* replaced reduction with simpler code; tuned radix towards fewer classes
* fixed compile dependency and runtime discrepancy
* long comment line
* fix build issues
* Apply suggestions from code review
  Co-authored-by: Andy Adinets
* addressed review comments
Co-authored-by: Andy Adinets
Commit b437b66
Commits on Nov 23, 2020
[REVIEW] link Logistic MNMG via dask-glm demo to readme (#3151)
* add dask-glm demo link
* add to changelog
Co-authored-by: Corey J. Nolet
Co-authored-by: Dante Gama Dessavre
Commit 54fe7b3
Add 0.15 and 0.16 release dates (#3054)
Updated with 0.15 and 0.16 release dates.
Co-authored-by: Corey J. Nolet
Commit 233b770
Commits on Nov 24, 2020
[WIP] Remove previously-replaced metrics.cu file (#3179)
* Remove outdated, extraneous file
* Update changelog
Commit ae55470
[REVIEW] Expose silhouette score in Python (#3164)
* Expose silhouette score in Python
* Style fix
* Correct dtypes used in silhouette_score
* Update changelog
* Fix style
* Update linebreaks
* Add copyright headers
* Collapse Python silhouette_score to single file
* Restructure silhouette_score for consistency
* Fix style
* Loosen silhouette score test tolerance
Commit 45aa019
[REVIEW] Fix gtest pinned cmake version for build from source option [skip-ci] (#3175)
* FIX Fix gtest pinned cmake version for build from source option
* DOC Added entry to changelog
Commit 37960f3
[REVIEW] Add probabilistic SVM tests with various input array types (#3176)
* Add probabilistic SVM tests with various input array types
* DOC update changelog
Commit e7f6bdf
Fix a bug in MSE metric calculation (#3182)
* Fix a bug in MSE metric calculation
* Style fix
* Add changelog
* Try smaller grid dimensions
Commit 55aace9
[REVIEW] blocks_per_sm FIL parameter in Python. (#3180)
* blocks_per_sm FIL parameter in Python.
* Updated CHANGELOG.md.
* Fixed style errors.
* Reduced the number of parameter combinations in the Python test.
Commit f1abcb8
Commit 56b294d
Commits on Nov 30, 2020
Commit 1369aad
[REVIEW] Estimator Pickling Demo & Adding to Docs (#3154)
* Adding simple dask estimator notebook to demonstrate saving/loading
* Renaming and updating cells
* Updating source.rst
* Updating changelog
* Updating pickling notebook
* Review updates
* More review feedback
Co-authored-by: John Zedlewski
Commit 63c8a44
[REVIEW] MNMG KNN Cl&Re fix + multiple improvements (#3051)
* Fix + multiple improvements
* Update changelog
* Update model output and testing
* Check style update
* Update comments
* Test one query partition
* Check style
Commit 4ce9b41
[REVIEW] Disable ascending=false path for sortColumnsPerRow and sporadically-failing FIL test [skip ci] (#3196)
* Disable ascending=false path for sortColumnsPerRow
* DOC Update changelog
* Disable flaky FIL test
Co-authored-by: John Zedlewski
Commit 8e8f965
Commits on Dec 1, 2020
[REVIEW] FIX Fix EXITCODE override in test_notebooks script (#3208)
* FIX Fix EXITCODE override in test_notebooks script
* DOC Changelog update
* FIX Move bash trap to after the GTests so they fail immediately
* FIX Move codecov block to gpu build
Commit 9c8dade
[REVIEW] Fix cuDF to cuPy conversion (missing value) (#3194)
* Fix cuDF to cuPy conversion (missing value)
* Changelog update
* Introducing fail_on_nan parameter
* Adding test with fail_on_nan=True
* Updating conversion
* Rename fail_on_nan into fail_on_null
Commit 4a0e67a
Fix Attribute error on ICPA #3183 and PCA input type (#3190)
This PR fixes the attribute error reported in #3183, plus two further bugs: the input type of PCA (the `sparse_scipy_to_cp()` function call was missing an argument) and the shape of `self.singular_values_`. It also adds tests for the bugs fixed here.
Authors:
- Mickael Ide
- John Zedlewski
Approvers:
- Divye Gala
- John Zedlewski
URL: #3190
Commit 4b2aaae
Set absolute tolerance to improve silhouette_score test consistency (#3214)
Add atol parameter to silhouette_score test to ensure consistent test behavior
Authors:
- William Hicks
Approvers:
- John Zedlewski
URL: #3214
Commit 397122e
Add gain to RF JSON dump (#3186)
I found it helpful when debugging the MSE metric calculation in random forest. Gain = Change in the metric (MSE / MAE / Gini / Entropy) that's attributed directly to each internal node (split).
Authors:
- Hyunsu Cho
Approvers:
- John Zedlewski
URL: #3186
Commit 398200f
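To make the "gain" definition above concrete, here is a minimal sketch using the Gini metric. It assumes the common decision-tree convention that gain is the parent's impurity minus the sample-weighted impurity of the two children; the function names are hypothetical and the exact normalization used in cuML's JSON dump may differ.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gain(parent, left, right, metric=gini):
    """Change in the metric attributed to one split (assumed weighting)."""
    n = len(parent)
    weighted_children = (len(left) * metric(left) + len(right) * metric(right)) / n
    return metric(parent) - weighted_children


parent = [0, 0, 1, 1]
print(split_gain(parent, [0, 0], [1, 1]))  # a split that separates the classes
print(split_gain(parent, [0, 1], [0, 1]))  # a split that separates nothing
```

A split that perfectly separates two balanced classes earns the full parent impurity as gain, while a split that leaves both children as mixed as the parent earns none.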
Add documentation for Distributed TFIDF Transformer (#3185)
This PR fixes #3173. With this PR it renders like below locally for me.
![image](https://user-images.githubusercontent.com/4837571/100154855-22761e00-2e5b-11eb-9be3-173e1a53ad08.png)
Authors:
- Vibhu Jawa
- Corey J. Nolet
Approvers:
- John Zedlewski
URL: #3185
Commit 13600f7
Merge pull request #3217 from rapidsai/branch-0.17
[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]
Commit e8a484b
Make Multinomial Naive Bayes inherit from ClassifierMixin (#3177)
To match the other classifiers, this makes Multinomial Naive Bayes inherit from ClassifierMixin and use its score method. Passes Naive Bayes tests locally. This closes #2614.
Authors:
- Chris Jarrett
- ChrisJar
Approvers:
- Corey J. Nolet
URL: #3177
Commit 6cc3d1f
Merge pull request #3222 from rapidsai/branch-0.17
[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]
Commit fefe83c
Added a missing __syncthreads() (#3215)
Added a missing `__syncthreads()`.
- also re-enabled Sparse16 FIL tests
- this should fix #3205 and #3206
Authors:
- Andy Adinets
- John Zedlewski
- Dante Gama Dessavre
Approvers:
- Thejaswi Rao
URL: #3215
Commit 5697e09
Merge pull request #3225 from rapidsai/branch-0.17
[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]
Commit 0d47fb7
[REVIEW] Least Angle Regression (#3160)
* FEA Lars solver
* Fix scaling of X and alpha, use n_cols in calcMaxStep, use binaryOp with alignment check
* Improve memory error handling. Additionally adjust Gram matrix condition for precompute='auto'
* DOC undo whitespace edit in changelog
* Add eps parameter and improve numeric error handling
* Fix include style
* Set coef_path[:,0] to zeros
* Add more extensive tests
* Move LARS to experimental namespace
* Remove unused imports
* FEA Detect and avoid collinear features
* DOC fix docstrings
* Various improvements
  - Convert input to fp64 to avoid problem with fp32 input
  - Improved debug logs
  - Added cpp unit test with n_rows = 65536
  - Avoid error during CUDA kernel calls if n_active == 0
  - Correct indexing error for x_scale
  - Test normalize param
  - Move precomputed Gram wrapping to the main fit method
* Correct docstring and Python style
* DOC Remove stray comma that triggered doxygen error
* Update RAFT GIT_TAG
* Correct __init__.py after moving LARS to experimental namespace
* Fix implicit type conversion error and enable FP32 training
* Fix base parameter docs and get_param_names
* Define explicit dtype for the intercept attribute
* Improve Lars test coverage, and decrease test tolerance
Co-authored-by: John Zedlewski
Commit da25d82
Merge pull request #3226 from rapidsai/branch-0.17
[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]
Commit 5492665
Commits on Dec 2, 2020
Specify branches to avoid pip resolver issue (#3218)
Avoid dependency resolution failure in latest version of pip by explicitly specifying versions for dask and distributed
Resolve #3210
Authors:
- William Hicks
Approvers:
- John Zedlewski
- Dillon Cullinan
- AJ Schmidt
URL: #3218
Commit 4ae84f4
Merge pull request #3227 from rapidsai/branch-0.17
[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]
Commit e4b6ced
Increase default SVM kernel cache to 1 GiB (#3223)
Closes #1558.
Authors:
- Tamas Bela Feher
Approvers:
- John Zedlewski
URL: #3223
Commit a5889a1
Merge pull request #3229 from rapidsai/branch-0.17
[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]
Commit 52e8cac
[REVIEW] Experimental versions of GPU accelerated Kernel and Permutation SHAP (#3126)
* FEA Separate kernel shap from shared shap branch for PR
* FIX Typos
* FIX Typos
* FEA Add files to cmakelists
* ENH small corrections and started incorporating PR review feedback
* ENH progress on remaining todos
* [FIX] typo
  Co-authored-by: John Zedlewski
* FIX Small function typo
* ENH Multiple small enhancements and fixes
* ENH Use tags for device model detection
* ENH data type changes
* ENH Add pytest files
* ENH multiple enhancements, completed todos and fixes
* ENH naming, comments and code enhancements to C++ code
* ENH clang-format cleanup
* ENH variable rename for clarity
* ENH Add explainer common pytests
* ENH Use raft handle device properties
* ENH Many more enhancements, better weighted linear regression
* ENH Add googletest and c++ improvements from PR feedback
* ENH clang-format and comments about the tests
* FIX remove straggling prints
* FIX Uncomment all other c++ tests
* ENH Multiple small python enhancements and bugfixes
* ENH More python small improvements, rename class to match mainline
* ENH Big python code cleanup and incorporating PR feedback. New SHAPBase class
* ENH Incorporate rest of feedback of KernelSHAP and Base
* ENH Add full coverage to explainer common tests
* ENH Small numeric and other enhancements
* ENH Multiple enhancements including coalesced kernel, generating samples by complements, googletest changes to account for that
* FIX clang format fixes
* FEA Improvements to pytests
* ENH More python enhancements and simplify perm SHAP to use SHAPBase class
* DOC Added entry to changelog
* ENH Various small style fixes, doc fixes, tidying up straggling comments
* FIX PEP8 fixes
* FIX test margins that I had forgotten to adjust, some might still be tighter than needed
* FIX add missing stream sync to test and print in case of failure
* FIX always run clang-format I keep telling myself...
* FIX Small type correction that seems to be the root of the googletest fail only on cuda 10.1
* FIX temporarily disable specific googletest for 0.17 burndown
* FIX Had disabled the test in the incorrect place :(
* FIX remove straggling prints
* Update cpp/src/explainer/kernel_shap.cu
  Co-authored-by: John Zedlewski
* Update python/cuml/common/import_utils.py
  Co-authored-by: John Zedlewski
* Update python/cuml/experimental/explainer/base.py
  Co-authored-by: John Zedlewski
* ENH incorporating PR review feedback
* FIX clang format fixes
* FIX reduce test size and case matrix of test that was slow in CI
* FIX c++ docstring fix
Co-authored-by: John Zedlewski
Commit 936995f
Merge pull request #3230 from rapidsai/branch-0.17
[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]
Commit 2efe3ed
[REVIEW] Update contributing doc for new label approach [skip-ci] (#3221)
* Update contrib doc for labels
* Update changelog
Co-authored-by: Dante Gama Dessavre
Commit 75979a6
Ignore splits that do not satisfy constraints (#3216)
* Ignore splits that do not satisfy constraints
* Fix giniGain()
* Fix all other metrics
* Add changelog
* Use base 2 log in the entropy metric
* Fix style check
Commit 2284e8e
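One item in this commit switches the entropy metric to a base-2 logarithm, which expresses impurity in bits. A minimal sketch of that metric, with a hypothetical function name and no claim to match cuML's kernel code:

```python
import math
from collections import Counter


def entropy_bits(labels):
    """Shannon entropy of a list of class labels, in bits (log base 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    probabilities = (count / n for count in Counter(labels).values())
    # Counter only holds labels that occur, so every p is > 0 and log2 is defined.
    return sum(-p * math.log2(p) for p in probabilities)


print(entropy_bits([0, 1]))        # two equally likely classes
print(entropy_bits([0, 1, 2, 3]))  # four equally likely classes
print(entropy_bits([7, 7, 7]))     # a pure node
```

With base 2, two equally likely classes give one bit, four give two bits, and a pure node gives zero.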
[REVIEW] Update to XGBoost 1.3.0rc1 [skip-ci] (#3219)
* Update to xgboost 1.3rc1
* Update changelog
Commit 0ae1bea
Merge pull request #3235 from rapidsai/branch-0.17
[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]
Commit 0626fce
[REVIEW] Fix __repr__ function for preprocessing models (#3191)
* Fix __repr__ for preprocessing models
* Changelog update
* Testing that __repr__ successfully returns for all models
Commit 966b85c
Merge pull request #3236 from rapidsai/branch-0.17
[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]
Commit f9f5452
Update docstring to document behavior of bootstrap=False [skip-ci] (#3187)
* Update docstring to document behavior of bootstrap=False
* Add changelog
* Apply suggestions from code review
Co-authored-by: John Zedlewski
Commit a0f4549
Merge pull request #3238 from rapidsai/branch-0.17
[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]
Commit 40d9568
Commits on Dec 3, 2020
Minor doc updates for 0.17 (#3240)
Includes some readme cleanup, adding things to README and api.rst. There are still minor rest warnings outstanding in FIL and SHAP that I'm looking into.
Authors:
- John Zedlewski
- Michael Demoret
Approvers:
- Corey J. Nolet
- Michael Demoret
URL: #3240
Commit a76d80f
Merge pull request #3242 from rapidsai/branch-0.17
[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]
Commit ff17722
Fix intermittent dask random forest failure (#3239)
Adding `n_classes=1` when `n_samples=1` to ensure that class is always labelled 0.
Closes #3202
Authors:
- Nanthini Balasubramanian
Approvers:
- John Zedlewski
URL: #3239
Commit 1287e86
Merge pull request #3244 from rapidsai/branch-0.17
[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]
Commit 50df6f6
-
updating raft to latest (#3241)
Authors:
  - divyegala
  - John Zedlewski
Approvers:
  - Dante Gama Dessavre
URL: #3241
Commit b7c1110
-
Merge pull request #3248 from rapidsai/branch-0.17
[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]
Commit ac70782
-
Rename rows_sample -> max_samples in RF to be consistent with sklearn's RF (#3245)
Rename rows_sample -> max_samples to be consistent with sklearn's RF. From https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html:
> **max_samples**: int or float, default=None
> If bootstrap is True, the number of samples to draw from X to train each base estimator.
> If None (default), then draw X.shape[0] samples.
> If int, then draw max_samples samples.
> If float, then draw max_samples * X.shape[0] samples. Thus, max_samples should be in the interval (0, 1).
> New in version 0.22.
Authors:
  - Hyunsu Cho
Approvers:
  - John Zedlewski
URL: #3245
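The quoted `max_samples` rules can be illustrated with a small sketch. The helper name `resolve_max_samples` is hypothetical and is not part of cuML or scikit-learn; rounding the fractional case to the nearest integer is an assumption of this sketch, since the quoted text only says "draw max_samples * X.shape[0] samples".

```python
def resolve_max_samples(n_rows, max_samples=None):
    """Return how many rows to draw per tree (hypothetical helper)."""
    if max_samples is None:
        # None: draw as many samples as there are rows.
        return n_rows
    if isinstance(max_samples, int):
        # int: draw exactly max_samples rows.
        return max_samples
    # float: a fraction of the rows (rounding is an assumption here).
    return int(round(max_samples * n_rows))


all_rows = resolve_max_samples(100)         # None case
fixed_rows = resolve_max_samples(100, 10)   # int case
half_rows = resolve_max_samples(100, 0.5)   # float case
```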
Commit d0cd8c1
-
Merge pull request #3249 from rapidsai/branch-0.17
[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]
Commit 04c2771
Commits on Dec 4, 2020
-
Avoid unnecessary split for degenerate case where all labels are identical (#3243)
Closes #3231
Closes #3128
Partially addresses #3188
The degenerate case (labels all identical in a node) is now robustly handled by computing the MSE metric separately for each of the three nodes (the parent node, the left child node, and the right child node). Doing so ensures that the gain is 0 for the degenerate case. The degenerate case may occur in some real-world regression problems, e.g. house price data where the price label is rounded up to the nearest 100k. As a result, the MSE gain is now computed very similarly to the MAE gain.
Disadvantage: now we always make two passes over data to compute the gain.
cc @teju85 @vinaydes @JohnZed
Authors:
  - Hyunsu Cho
  - Philip Hyunsu Cho
Approvers:
  - Thejaswi Rao
  - John Zedlewski
URL: #3243
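A minimal NumPy sketch of the idea described above, not the actual CUDA implementation: the MSE is computed separately for the parent, left child and right child, each with its own two passes (mean, then squared deviations). The function names and the weighting of each child by its share of the samples are assumptions made for illustration.

```python
import numpy as np


def node_mse(labels):
    """MSE of one node: first pass for the mean, second for the deviations."""
    labels = np.asarray(labels, dtype=np.float64)
    mean = labels.mean()
    return float(np.mean((labels - mean) ** 2))


def mse_gain(labels, left_mask):
    """Gain of a split: parent MSE minus sample-weighted child MSEs (assumed form)."""
    labels = np.asarray(labels, dtype=np.float64)
    left_mask = np.asarray(left_mask, dtype=bool)
    left, right = labels[left_mask], labels[~left_mask]
    if left.size == 0 or right.size == 0:
        return 0.0  # not a real split
    n = labels.size
    return (node_mse(labels)
            - (left.size / n) * node_mse(left)
            - (right.size / n) * node_mse(right))


split = [True, True, False, False]
degenerate_gain = mse_gain([300000.0] * 4, split)       # all labels identical
useful_gain = mse_gain([1.0, 1.0, 5.0, 5.0], split)     # split separates the labels
```

With identical labels every node's MSE is exactly zero, so the gain is zero and no split is warranted.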
Commit a4c8de5
-
Merge pull request #3254 from rapidsai/branch-0.17
[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]
Commit 89e4bd3
-
Fixing label binarizer bug with multiple partitions (#3250)
Authors:
  - Corey J. Nolet
Approvers:
  - John Zedlewski
URL: #3250
Commit c771926
-
Merge pull request #3262 from rapidsai/branch-0.17
[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]
Commit 91fb062
Commits on Dec 5, 2020
-
[REVIEW] Hide silhouette_score Python binding due to memory issue (#3258)
* Hide silhouette_score Python binding
  Remove this feature due to memory issues in C++ implementation for anything but modest numbers of samples
* Remove silhouette_score tests
* Update changelog
* Remove unused import
* Remove silhouette_score from new features list
* Add note on reason for hiding silhouette_score
* Update docstrings with silhouette_score warning
  Also remove silhouette_score from api.rst docs
* Update CHANGELOG to restore reference to reverted PR
Commit 40b089b
-
Merge pull request #3263 from rapidsai/branch-0.17
[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]
Commit e83d6a1
-
Fix MNMG KNN doc (adding batch_size) (#3246)
Answers #3232. Explicitly specify `batch_size` as a parameter to MNMG KNN models in order to make it visible in the documentation.
Authors:
  - viclafargue
  - Corey J. Nolet
Approvers:
  - Corey J. Nolet
  - John Zedlewski
URL: #3246
Commit d063ca4
-
Merge pull request #3265 from rapidsai/branch-0.17
[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]
Commit 54ea23c
Commits on Dec 9, 2020
-
Add secondary test to kernel explainer pytests for stability in Volta (#3282)
* FIX Add secondary test to kernel explainer pytests for stability in Volta
* DOC Added entry to changelog
* FIX PR review feedback
Commit 5ff2794
-
Merge pull request #3286 from rapidsai/branch-0.17
[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]
Commit f02dbe6
-
[REVIEW] Correct pure virtual declaration in manifold_inputs_t (#3279)
* Correct pure virtual declaration in manifold_inputs_t
* Update changelog
Commit 0aad11c
Commits on Dec 11, 2020
-
Remove unused keyword in PorterStemmer code (#3289)
Remove keyword "stops" from call to cudf.core.column.string.slice, which no longer accepts arbitrary keywords. cuDF change introduced in rapidsai/cudf#6750.
Authors:
  - William Hicks
Approvers:
  - John Zedlewski
  - Micka
URL: #3289
Commit e495dbe
-
Fix SVR unit test parameter (#3294)
Linear SVR has the coef_ attribute in the Python layer. In the C++ unit test the same vector is denoted by _w_, and it is defined as a linear combination of the support vectors
![image](https://user-images.githubusercontent.com/3671106/101908077-ce3d9e80-3bbb-11eb-98ff-e7be90828dde.png)
The number of elements in _w_ is n_cols. One of the SVR tests defined only 1 expected value for _w_ instead of the expected n_cols=2 values, which led to accessing an uninitialized value. This would fail the test unless the value happened to be zero-initialized. Surprisingly, this happened extremely rarely. This PR fixes the expected value _w_exp_.
Authors:
  - Tamas Bela Feher
Approvers:
  - Dante Gama Dessavre
URL: #3294
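To make the size argument concrete, here is a small NumPy sketch under the assumption that _w_ is the sum of the support vectors weighted by their dual coefficients (the exact formula is only shown in the image above). The function name and the sample values are made up for illustration.

```python
import numpy as np


def linear_svr_weights(dual_coefs, support_vectors):
    """w = sum_i coef_i * x_i, so w has one entry per column (feature)."""
    dual_coefs = np.asarray(dual_coefs, dtype=np.float64)
    support_vectors = np.asarray(support_vectors, dtype=np.float64)
    return dual_coefs @ support_vectors


# Three support vectors with n_cols = 2 features each (illustrative values).
support_vectors = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
dual_coefs = np.array([0.5, -1.0, 0.25])
w = linear_svr_weights(dual_coefs, support_vectors)
```

Whatever the number of support vectors, `w` has n_cols entries, so a test for n_cols=2 needs two expected values.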
Commit 2e4388d
-
Add KNN parameter to t-SNE (#2592)
Closes #1780
Adding kNN graph input functionality to t-SNE, a request broken off of the issue #1733. t-SNE gathers kNN indices and distances in the first stage of its computation; by providing their own kNN graph, users can skip this step. This should follow #1815 as closely as possible.
**Benefits of this**:
- allow user custom run of kNN algorithm
- can use different distance function instead of t-SNE euclidean default
- allows for speedup if performing grid search by storing and reusing kNN graph
**Includes:**
- [x] Abstracted `extract_knn_graph` so it can be used for both UMAP and t-SNE
- [x] Implemented kNN graph input to Python/Cython layer and C++/CUDA layer
- [x] C++/CUDA Barnes Hut and Exact t-SNE tests
- [x] Python t-SNE tests
- [x] General code cleanup wherever needed
Authors:
  - Aleksander Ficek
  - Corey J. Nolet
  - Ray Douglass
Approvers:
  - Corey J. Nolet
URL: #2592
Commit 2e2125f
-
Linear models predict function consolidation (#3256)
* FEA Consolidate linear model gemm based predicts on one function on C++
* FEA Consolidate linear model gemm based predicts on one function on Python
* DOC Added entry to changelog
* FIX PEP8 fixes
* FIX Forgot clang-format
* FIX Remove C++ sync calls and unnecessary delete on Python based on PR feedback
* DOC Remove changelog entry
Commit 9fa7112
Commits on Dec 14, 2020
-
[REVIEW] Refactoring: move internal FIL interface to a separate file (#3292)
* Refactoring: move internal FIL interface to a separate file.
  - move the functions not related to treelite import, prediction or freeing the model to a separate file
* Fixed style errors.
Commit 62e2152
-
Approximate Nearest Neighbors (#2780)
This PR will enable the usage of multiple KNN strategies as alternatives to the current default bruteforce method. See #574
Authors:
  - wxbn
  - viclafargue
  - Corey J. Nolet
Approvers:
  - Corey J. Nolet
URL: #2780
Commit bd43f32
-
Add xfail on fetching 20newsgroup dataset (test_naive_bayes) (#3291)
This PR fixes CI failures that happen on `test_naive_bayes` when the machine can't download the 20 newsgroup dataset. It closes #3260
Authors:
  - Mickael Ide
Approvers:
  - John Zedlewski
URL: #3291
Commit 34efaf8
Commits on Dec 16, 2020
-
Commit 2b753d4
Commits on Dec 17, 2020
-
[REVIEW] 018 add unfitted error pca & tests on IPCA (#3272)
* Adding NotFittedError to PCA
* Fixed typo in PCA import
* Fixed check_is_fitted call
* Fixed missing parenthesis
* Added test on svd_flip
* fix style ipca
* Fixed whitespace style
* Removed useless test
Commit 47b6296
-
Removed FIL node types with `_t` suffix. (#3314)
- only the node types without the `_t` suffix are now used
- removed the functions necessary to handle node types with the `_t` suffix
Commit 2316937
-
Provide workaround for cupy.percentile bug (#3315)
Ensure that the 100th percentile value returned by cupy.percentile is the maximum of the input array rather than (possibly) NaN due to cupy/cupy#4451. This eliminates an intermittent failure observed in tests of KBinsDiscretizer, which makes use of cupy.percentile. Note that this includes an alteration of the included sklearn code and should be reverted once the upstream cupy issue is resolved.
Resolve failure due to ValueError described in #2933.
Authors:
  - William Hicks
Approvers:
  - Dante Gama Dessavre
  - Victor Lafargue
URL: #3315
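The shape of such a workaround can be sketched as follows. This is an illustration only, not the code from the PR: it uses NumPy as a stand-in for CuPy so that it runs without a GPU, and the function name `percentile_with_max_fix` is hypothetical.

```python
import numpy as np


def percentile_with_max_fix(column, percentiles):
    """Compute percentiles, then force the 100th one to the column maximum."""
    column = np.asarray(column, dtype=np.float64)
    percentiles = np.asarray(percentiles, dtype=np.float64)
    edges = np.percentile(column, percentiles)
    # Overwrite whatever the library returned for q == 100 with the true max,
    # so a NaN at that position cannot leak into the bin edges.
    edges[percentiles == 100] = column.max()
    return edges


bin_edges = percentile_with_max_fix([1.0, 2.0, 3.0, 4.0], [0, 50, 100])
```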
Commit 550121b
-
Return confusion matrix as int unless float weights are used (#3275)
This PR converts the confusion matrix to int when possible, to avoid scientific notation in the output. See this example:
![image](https://user-images.githubusercontent.com/9810050/101400035-9808d200-38d0-11eb-9f81-4d217a5ff202.png)
Authors:
  - Mickael Ide
Approvers:
  - John Zedlewski
URL: #3275
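A minimal NumPy sketch of the dtype rule in the title, assuming integer class labels in `range(n_classes)`. The function below is a hypothetical illustration, not cuML's `confusion_matrix`.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes, sample_weight=None):
    """Count (true, predicted) pairs; integer result unless weights are float."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if sample_weight is None:
        weights = np.ones(y_true.shape[0], dtype=np.int64)
    else:
        weights = np.asarray(sample_weight)
    # Float weights keep a float matrix; anything else yields int64 counts.
    dtype = np.float64 if weights.dtype.kind == "f" else np.int64
    cm = np.zeros((n_classes, n_classes), dtype=dtype)
    np.add.at(cm, (y_true, y_pred), weights)
    return cm


y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]
cm_int = confusion_matrix(y_true, y_pred, n_classes=2)
cm_float = confusion_matrix(y_true, y_pred, n_classes=2,
                            sample_weight=[0.5, 0.5, 1.0, 1.0])
```

Printing `cm_int` shows plain integer counts, which is the readability gain the PR describes.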
Commit 756061e
-
Remove static specifier in DecisionTree unit test for C++14 compliance (#3281)
Replace "constexpr static" member variables in the DecisionTree unit test fixture with "const" member variables for compliance with C++14, which otherwise requires that const static data members be separately defined at namespace scope if they are ODR-used (see sections 3.2 and 9.4.2 of the C++11 standard, which remain relevant until C++17).
Authors:
  - William Hicks
Approvers:
  - Dante Gama Dessavre
URL: #3281
Commit ae7e444