Knn Imputer Class and dependency functionalities #4820
base: branch-23.02
Fix forward merge rapidsai#4357 [skip-ci]
Implementing LinearSVM using the existing QN solvers. Authors: - Artem M. Chirkin (https://github.com/achirkin) Approvers: - Tamas Bela Feher (https://github.com/tfeher) - Robert Maynard (https://github.com/robertmaynard) - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#4268
[gpuCI] Forward-merge branch-21.12 to branch-22.02 [skip gpuci]
Authors: - Dante Gama Dessavre (https://github.com/dantegd) Approvers: - William Hicks (https://github.com/wphicks) URL: rapidsai#4293
[gpuCI] Forward-merge branch-21.12 to branch-22.02 [skip gpuci]
Closes rapidsai#3846

Adds support for exogenous variables to ARIMA. All series in the batch must have the same number of exogenous variables, and exogenous variables are not shared across the batch (`exog` therefore has `n_exog * batch_size` columns).

Example:
```python
model = ARIMA(endog=df_endog, exog=df_exog_past, order=(1,0,1),
              seasonal_order=(1,1,1,12), fit_intercept=True,
              simple_differencing=False)
model.fit()
fc, lower, upper = model.forecast(40, exog=df_exog_future, level=0.95)
```
![2021-09-22_exog_fc](https://user-images.githubusercontent.com/17441062/134339807-f815a7a3-98dc-49e5-8599-9607e660597a.png)

Authors: - Louis Sugy (https://github.com/Nyrio) - Tamas Bela Feher (https://github.com/tfeher) Approvers: - AJ Schmidt (https://github.com/ajschmidt8) - Tamas Bela Feher (https://github.com/tfeher) - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#4221
[gpuCI] Forward-merge branch-21.12 to branch-22.02 [skip gpuci]
Addresses rapidsai#4110

This is an experimental prototype. For now, it supports:
* XGBoost models with numerical splits
* cuML RF regressors with numerical splits

cuML RF classifiers are not supported.

Authors: - Philip Hyunsu Cho (https://github.com/hcho3) Approvers: - Rory Mitchell (https://github.com/RAMitchell) - William Hicks (https://github.com/wphicks) - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#4351
[gpuCI] Forward-merge branch-21.12 to branch-22.02 [skip gpuci]
Closes rapidsai#3805 Authors: - Micka (https://github.com/lowener) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#4361
[gpuCI] Forward-merge branch-21.12 to branch-22.02 [skip gpuci]
This upgrade is required to be in-line with: rapidsai/cudf#9716 Depends on: rapidsai/integration#390 Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) - Ray Douglass (https://github.com/raydouglass) URL: rapidsai#4372
[gpuCI] Forward-merge branch-21.12 to branch-22.02 [skip gpuci]
Fix Changelog Merge Conflicts for `branch-21.12`
[gpuCI] Forward-merge branch-21.12 to branch-22.02 [skip gpuci]
Changes to be in-line with: rapidsai/cudf#9734 Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) - AJ Schmidt (https://github.com/ajschmidt8) URL: rapidsai#4390
[gpuCI] Forward-merge branch-21.12 to branch-22.02 [skip gpuci]
cc @robertmaynard @quasiben @raydouglass Authors: - Dante Gama Dessavre (https://github.com/dantegd) Approvers: - AJ Schmidt (https://github.com/ajschmidt8) URL: rapidsai#4392
[gpuCI] Forward-merge branch-21.12 to branch-22.02 [skip gpuci]
Authors: - Philip Hyunsu Cho (https://github.com/hcho3) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#4398
[gpuCI] Forward-merge branch-21.12 to branch-22.02 [skip gpuci]
…idsai#4400) PR uses project flash to build the cuML Python package, mirroring what the C++ flow looks like. Note: This is currently changed only for the CUDA 11.0 GPU test, since that one uses Python 3.7; to cover the other jobs we need to build the Python package twice in the CPU job.
[gpuCI] Forward-merge branch-21.12 to branch-22.02 [skip gpuci]
Authors: - Peter Andreas Entschev (https://github.com/pentschev) Approvers: - AJ Schmidt (https://github.com/ajschmidt8) - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#4396
…#4382) Suggest using LinearSVM when the user chooses to use the linear kernel in SVM. The reason is that LinearSVM uses a specialized faster solver. Closes rapidsai#1664 Also partially addresses rapidsai#2857 Authors: - Artem M. Chirkin (https://github.com/achirkin) Approvers: - Tamas Bela Feher (https://github.com/tfeher) - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#4382
Authors: - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#4373
…ai#4405) There were actually 2 minor issues that prevented `UMAPAlgo::Optimize::find_params_ab()` from being ASAN-clean at the moment:
- One is the memory leaks, of course
- Another one is the `malloc()`-`delete` mismatch: only memory allocated using `new` or equivalent should be freed with operator `delete` or `delete[]`

Another issue that was also addressed here: exception safety (i.e., by using `make_unique` from C++14)

Signed-off-by: Yitao Li <[email protected]> Authors: - Yitao Li (https://github.com/yitao-li) Approvers: - Zach Bjornson (https://github.com/zbjornson) - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#4405
P_sum is equal to n. See rapidsai#2622 where I made this change once before. rapidsai#4208 changed it back while consolidating code. Authors: - Zach Bjornson (https://github.com/zbjornson) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#4425
rerun tests
Pass the `NVTX` option to raft in a way more similar to the other arguments, and make sure the `RAFT_NVTX` option is in the installed `raft-config.cmake`. Authors: - Artem M. Chirkin (https://github.com/achirkin) Approvers: - Corey J. Nolet (https://github.com/cjnolet) - Robert Maynard (https://github.com/robertmaynard) URL: rapidsai#4825
The conda recipe was updated to UCX 1.13.0 in rapidsai#4809 , but updating conda environment files was missing there. Authors: - Peter Andreas Entschev (https://github.com/pentschev) Approvers: - Jordan Jacobelli (https://github.com/Ethyling) URL: rapidsai#4813
Allows cuML to be installed with CuPy 11. xref: rapidsai/integration#508 Authors: - https://github.com/jakirkham Approvers: - Sevag H (https://github.com/sevagh) - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#4837
Codecov Report
Base: 77.62% // Head: 78.24% // Increases project coverage by +0.61%
Additional details and impacted files
@@ Coverage Diff @@
## branch-22.10 #4820 +/- ##
================================================
+ Coverage 77.62% 78.24% +0.61%
================================================
Files 180 181 +1
Lines 11384 11610 +226
================================================
+ Hits 8837 9084 +247
+ Misses 2547 2526 -21
Merge PR #4797 before merging this one; the functionality this PR depends on is in #4797.
This draft PR adds a KNN Imputer class and its dependency functionality for imputing missing values.
Supported inputs: NumPy arrays, pandas DataFrames, CuPy arrays, cuDF DataFrames
Tested on: a single Tesla T4 GPU
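For readers unfamiliar with the technique, here is a minimal pure-Python sketch of KNN imputation in general. It is not the code in this PR: the function names `nan_euclidean` and `knn_impute` are hypothetical, and the sketch assumes a NaN-aware Euclidean distance with uniform averaging over the nearest donors, which is how KNN imputation is commonly defined.

```python
import math

def nan_euclidean(a, b):
    """Euclidean distance over coordinates present in both rows, rescaled
    by (total coordinates / shared coordinates). Returns inf if no
    coordinate is shared. (Hypothetical helper for illustration.)"""
    shared = [(x, y) for x, y in zip(a, b)
              if not (math.isnan(x) or math.isnan(y))]
    if not shared:
        return math.inf
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(len(a) / len(shared) * squared)

def knn_impute(rows, n_neighbors=2):
    """Return a copy of rows in which each NaN is replaced by the mean of
    that column over the n_neighbors nearest rows that observe the column.
    (Hypothetical helper; a sketch, not the PR's implementation.)"""
    out = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if not math.isnan(value):
                continue
            # Candidate donors: other rows where column j is observed.
            donors = [(nan_euclidean(row, other), other[j])
                      for k, other in enumerate(rows)
                      if k != i and not math.isnan(other[j])]
            donors = [d for d in donors if d[0] != math.inf]
            donors.sort(key=lambda d: d[0])
            nearest = donors[:n_neighbors]
            if nearest:
                out[i][j] = sum(v for _, v in nearest) / len(nearest)
    return out

nan = float("nan")
data = [[1.0, 2.0, nan],
        [3.0, 4.0, 3.0],
        [nan, 6.0, 5.0],
        [8.0, 8.0, 7.0]]
imputed = knn_impute(data, n_neighbors=2)
```

In this toy input the missing value in the first row is filled from the second and third rows, and the missing value in the third row from the second and fourth rows. A GPU implementation would vectorize the distance computation rather than loop over rows as done here.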
Time latency:
Tested on NumPy arrays with 25% of the data masked, averaged the distance metric, and set the column size to 100.

| Data points | cuML | scikit-learn |
|---|---|---|
| 100000 | 0.513s | 0.383s |
| 1M | 10.5s | 36.1s |
| 10M | 105s | 373s |
Tested on NumPy arrays with 1% of the data masked, averaged the distance metric, and set the column size to 100.

| Data points | cuML | scikit-learn |
|---|---|---|
| 100000 | 0.217s | 0.208s |
| 1M | 2.86s | 7.73s |
| 10M | 10.2s | 122s |
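The masking setup behind these timings could be reproduced along the following lines. This is a sketch under stated assumptions, not the script used for the numbers above: `mask_fraction`, `best_time`, and `column_mean_impute` are hypothetical names, the data is uniform random, and a simple column-mean imputer stands in for the imputer being measured.

```python
import math
import random
import time

def mask_fraction(rows, fraction, seed=0):
    """Return a copy of rows with `fraction` of all cells set to NaN.
    (Hypothetical helper for illustration.)"""
    rng = random.Random(seed)
    n_rows, n_cols = len(rows), len(rows[0])
    n_masked = round(fraction * n_rows * n_cols)
    masked = [list(r) for r in rows]
    for cell in rng.sample(range(n_rows * n_cols), n_masked):
        masked[cell // n_cols][cell % n_cols] = math.nan
    return masked

def best_time(fn, *args, repeats=3):
    """Smallest wall-clock time in seconds over `repeats` calls of fn(*args)."""
    best = math.inf
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best

def column_mean_impute(rows):
    """Stand-in imputer (assumption): fill NaNs with the column mean.
    Replace this with the imputer under test."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        seen = [r[j] for r in rows if not math.isnan(r[j])]
        means.append(sum(seen) / len(seen) if seen else math.nan)
    return [[means[j] if math.isnan(v) else v for j, v in enumerate(r)]
            for r in rows]

# Small example: 200 rows, 100 columns, 25% of the cells masked.
rng = random.Random(42)
data = [[rng.random() for _ in range(100)] for _ in range(200)]
masked = mask_fraction(data, 0.25)
n_nan = sum(math.isnan(v) for row in masked for v in row)
elapsed = best_time(column_mean_impute, masked)
```

Taking the best of several repeats reduces noise from warm-up effects. For GPU code the timed function would also need to synchronize the device before the clock is read, otherwise only the kernel launch is measured.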
Profiling on 1 million records:
CuPy's built-in functions account for more of the run time.