sync with upstream #19

daxiongshu · 2020-11-17T21:59:38Z

No description provided.

* Speed up test_lightgbm * Speed up test_fil_regression * Update changelog * Test FIL predict() with binary classifier * Add a TODO comment * Explicitly indicate skipped tests in test_fil_skl_classification * Test n_classes=25 with n_estimators=1 * Address reviewer's feedback * Fix style

…e to underscore format (#3065) * splitting `cpp/src/metrics.cu` into separately compiled files * updated CHANGELOG.md * file-naming cleanup from camelCase to under_score * refactoring `randIndex` instances to `rand_index` * refactored `silhouetteScore` instances to `silhouette_score` * refactoring all `adjustedRandIndex` and `adjustedrandindex` to `adjusted_rand_index` * more `adjusted_rand_index` fixes * refactored `klDivergence` instances to `kl_divergence` * refactoring `mutualInfoScore` instances to `mutual_info_score` * refactoring `homogeneityScore` instances to `homogeneity_score` * refactoring `completenessScore` instances to `completeness_score` * refactoring `vMeasure` instances to `v_measure` * refactoring `pairwiseDistance` and related instances to `pairwise_distance` * preserving camelCase in relevant places * further `rand_index` refactoring in remaining nooks and corners * updating CHANGELOG.md * FIX clang-format fixes * flake8 fix * adding related changes from PR #3072 that affected this PR * resolving function name conflicts in the cython layer * adding a `cython_` prefix to cython headers wherever conflicted * updating appropriately in `__init__.py` files

…not fit with probability=True [skip-ci] (#3114) * Fixed typo in the AttributeError message (line 464): "with" at the end of the second line and "probability" at the beginning of the third line had no space between them. * Update CHANGELOG.md

* Update README * UPDATE changelog * Apply suggestions from code review Co-authored-by: Dante Gama Dessavre <[email protected]> Co-authored-by: Nanthini Balasubramanian <[email protected]> Co-authored-by: Dante Gama Dessavre <[email protected]>

* Patch and test for RF crash #3107 * Cleanups of RF regression fixes * Add failing tests to RF regression * Expand experimental backend testing and align pointers * Expand python RF regression test * Updates based on review feedback * Update changelog * Add classification tests * Review comments and style fixes for RF

* draft 1 of better test parameter specification * refactor using variadic macros; move fil enums to own namespace * changelog; fixed fil.pyx enum import * simpler FIL_TEST_PARAMS macro, remove the ::enums:: changes * leaner change * renamed struct responsible for non default FIL test parameters * style

…2956) * Change get_params and set_params to a property params * Update deprecated docstring * Update changelog * Fix style * Change ARIMA parameters into cuML arrays, write variant of llf to avoid unnecessary memory copies, rename setter/getter, override get_params and set_params with NotImplementedError * Mark get_param_names as not implemented instead of get_params and set_params * Cleanup PR, remove redundancy, more efficient pack/unpack * Fix Python style Co-authored-by: John Zedlewski <[email protected]>

…ators [skip-ci] (#3040) * Adding additional checking for incorrect use cases. Added CumlArrayDescriptor * Cleaning up more use cases * Initial commit of CumlArrayDescriptor in PCA * Incrementally updating CumlArray uses * Adding some improvements to decorators to auto detect certain scenarios where a function returns CumlArray * Adding internals.func_utils to test wrapping all functions and checking output types * Commit before merging upstream * Updating naive_bayes * Partial working state * Updating KMeans * Partial pass over all Base subclasses * Mostly complete pass of removing to_output * Completed cleanup of Base method removal * Cleaning up more to_output uses. Fixing test errors * Adding tartet_arg property and fixing tests that can use it * More cleanup and test fixing * Updating types derived from Base to properly use get_param_names and allow setting Base values in constructor * Fixing import order. Adding support for sparse arrays * Attempting to fix nearest neighbors * Removing commented code * Fixing failing tests * Fixing more tests * Adding PR to CHANGELOG and style fixes * Fixing missing import * Removing protocol interface for python 3.7 * Fixing ARIMA. Required including changes from PR#2956 * Fixing labelbinarizer and KNN failing tests * Removing "invalid syntax" so flake8 can run * Adding more wrappers to ARIMA so tests pass. * Committing CI change to allow tests to run. * Moving memory check to plugin * Adding ability to load SPD environment variables to the logger * Changing pytest import-mode to better support development * Changing relative imports to absolute * Adding first iteration of dev guide to see how it looks * Improving the quick_run plugin * Removing skip_* from cuml decorators * Fixing cuml_decorators test.
* Removing the logger environment addition * Updating non-Base methods to use decorators * Large cleanup of remaining to_output, with_cupy_rmm and input_to_dev_ptr * Style cleanup * Apply John's suggestions from code review on Dev Guide Co-authored-by: John Zedlewski <[email protected]> * Large update to Estimator Guide incorporating feedback from JohnZ * Removing array tracking and putting in plugin * Removing PR Description file * Removing ArrayOutputable * Removing test plugins * Cleaning up code to remove unnecessary diffs * Style cleanup * Defaulting to cp array instead of np, per feedback * Adding additional tests * Separating func_tools into separate files * Removing extra changes to conftest.py which should not have been committed. * Renaming base.py back to base.pyx * Apply suggestions from code review Co-authored-by: Dante Gama Dessavre <[email protected]> * Incorporating feedback from Dante's code review * Removing straggling TODO * Applying Dante's Revisions to ESTIMATOR_GUIDE Co-authored-by: Dante Gama Dessavre <[email protected]> * Updating ESTIMATOR_GUIDE based on feedback from Dante * Cleaning up straggling to_output * Another iteration on code review feedback * Style cleanup * More small items from code review * One final change to ESTIMATOR_GUIDE * Updating all *_mg.pyx files to use the new naming conventions and CumlArrayDescriptor Co-authored-by: John Zedlewski <[email protected]> Co-authored-by: Dante Gama Dessavre <[email protected]>

…#3069) * Maintain dataframe output for single-series frames * Add unit test for single-series input type check * Update changelog * Add test for Series to DataFrame preprocessing * Handle output from preprocessors increasing dims * Allow norms to be returned as Series

* Fix Stochastic Gradient Descent Example: the example currently in the docs does not run because dtype, penalty, lrate, and loss are not defined. The new version sets the default values for the parameters of cumlSGD and copies the dtype from the Mini Batch SGD Regression example for pred_data['col1'] and pred_data['col2']. Running the example also produced slightly different output values, so these were updated as well. * Added PR #3136 to 0.17 Bug Fixes

venkywonka and others added 22 commits November 2, 2020 11:11

Splitting ml metrics to individual files (#3033)

983c6f8

* splitting `cpp/src/metrics.cu` into separately compiled files * updated CHANGELOG.md * file-naming cleanup from camelCase to under_score * adding related changes from PR #3072 that affected this PR

[REVIEW] Speeding up MNMG KNN Cl&Re testing (#3052)

5b7757a

* Speeding up MNMG KNN Cl&Re testing * Update changelog * Testing with extreme values

Fix artifacts in t-SNE results (#3084)

898f480

Fixes #3057 Co-authored-by: Corey J. Nolet <[email protected]>

[REVIEW] Improve test_kmeans runtime (#3077)

913a81d

* Use single random seed in kmeans tests * Prune redundant kmeans parameterization tests * Update changelog * Add extra k-means|| test Co-authored-by: Dante Gama Dessavre <[email protected]>

[REVIEW] speed test_array (#3112)

63fc249

* ENH speed test_array * DOC Added entry to changelog

[REVIEW] Speed up UMAP mnmg tests (#3115)

d9f86bc

* Speed up umap MNMG tests by lowering data sizes and removing parameters to test * Removing accidental change * Updating changelog Co-authored-by: Dante Gama Dessavre <[email protected]>

[Review] Fix memset args for benchmark (#3119)

93eef64

* FIX Fix memset args for benchmark * DOC Update changelog

[REVIEW] Adding Cython to Code Coverage (#3111)

25b29f5

* Adding ability to build with --linetrace=1 to support cython codecov * Adding PR to CHANGELOG * Style cleanup * Converting BUILD_PYTHON_ARGS to be an argument in build.sh

Return Python string from dump_as_json() of RF (#3130)

d9a73a7

* Return Python string from dump_as_json() of RF * Add changelog

Avoid useless copy when exchanging flag buffers in CSR WeakCC (#3096)

708ae47

[REVIEW] Improving the Deprecation Message Formatting in Documentation (#3134)

f1cca8d

* Improving the deprecation message formatting in pydocs * Adding PR to CHANGELOG

[REVIEW] Move DistanceType enum to RAFT (#3141)

33069cf

* Update all DistanceType references * Style fix * Update changelog

daxiongshu merged commit e6d8ec3 into daxiongshu:branch-0.17 Nov 17, 2020
