[FEA] Support Naive Bayes variants #1666
Labels
Cython / Python
Cython or Python issue
Dask / cuml.dask
Issue/PR related to Python level dask or cuml.dask features.
feature request
New feature or request
Comments
cjnolet added the "feature request" and "? - Needs Triage" labels on Feb 12, 2020
cjnolet added the "Cython / Python" and "Dask / cuml.dask" labels and removed the "? - Needs Triage" label on Feb 12, 2020
rapids-bot bot pushed a commit that referenced this issue on Jul 22, 2021
This is a continuation of PR #1763, to add Multinomial and Bernoulli NB variants. The Gaussian and Categorical variants will be added in a following PR.
Also linking issue #1666.
Authors:
- Micka (https://github.com/lowener)
- Corey J. Nolet (https://github.com/cjnolet)
Approvers:
- Corey J. Nolet (https://github.com/cjnolet)
URL: #4053
rapids-bot bot pushed a commit that referenced this issue on Aug 9, 2021
This is a continuation of PR #1763 and #4053, to add Gaussian Naive Bayes. This is supposed to be merged after #4053.
Here is a comparison of cuML and SKLearn performance on Gaussian NB. This is done using a synthetic dataset generated by make_regression. The GPU used is an RTX 8000, and the CPU is an i9-10920X @ 3.50GHz.
![gaussian](https://user-images.githubusercontent.com/9810050/126572439-8982faa8-5ad1-4bca-91ab-76704050bf33.png)
Linking issue #1666.
Authors:
- Micka (https://github.com/lowener)
- Corey J. Nolet (https://github.com/cjnolet)
Approvers:
- Corey J. Nolet (https://github.com/cjnolet)
URL: #4079
rapids-bot bot pushed a commit that referenced this issue on Sep 8, 2021
This is a continuation of PR #1763, #4053, and #4079, to add Categorical Naive Bayes. This is supposed to be merged after #4079.
Linking issue #1666.
Authors:
- Micka (https://github.com/lowener)
Approvers:
- Corey J. Nolet (https://github.com/cjnolet)
URL: #4150
rapids-bot bot pushed a commit that referenced this issue on Mar 7, 2022
Closes #1666. The implementation of this variant is straightforward and matches sklearn.
Authors:
- Micka (https://github.com/lowener)
Approvers:
- Corey J. Nolet (https://github.com/cjnolet)
URL: #4595
There are 4 different variants of Naive Bayes in Scikit-learn:
Between experimentation with CuPy `RawKernel` and abstracting it for type agnosticism, creating a new directory for CuPy/Python-based prims, and the initial multinomial Naive Bayes implementation, the Naive Bayes PR (#1375) has become quite large and needs to be merged.

The primary primitive in the multinomial Naive Bayes variant is a custom `RawKernel` that uses shared memory and `atomicAdd` to count features for each class. It also supports squaring the sums so that it can be used to extract a mean and variance for the Gaussian variant. The remaining variants of the algorithm should also be able to make use of this primitive.

Given the infrastructure provided by #1375, adding these variants should be straightforward, with effort ranging from moderate to trivial. The distributed variants should be able to simply proxy to the single-GPU classes, combining the underlying parameters in the same way as the existing multinomial version.
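
To make the idea concrete, here is a minimal CPU-only sketch in plain Python of what such a primitive computes. It is not the cuML `RawKernel` from #1375. The function names (`partial_stats`, `merge_stats`, `gaussian_params`) are hypothetical, and the sketch assumes that "squaring the sums" means accumulating squared feature values alongside the plain sums. Each `+=` on an accumulator stands in for what would be an `atomicAdd` on the GPU, and `merge_stats` illustrates how partial results from several workers could be combined in the distributed case.

```python
def partial_stats(X, y):
    """Accumulate per-class counts, feature sums, and sums of squares.

    X is a list of feature rows and y the matching class labels.
    Hypothetical helper; on the GPU each += below would be an atomicAdd.
    """
    stats = {}
    for row, label in zip(X, y):
        if label not in stats:
            stats[label] = {
                "count": 0,
                "sum": [0.0] * len(row),
                "sq_sum": [0.0] * len(row),
            }
        entry = stats[label]
        entry["count"] += 1
        for j, value in enumerate(row):
            entry["sum"][j] += value
            entry["sq_sum"][j] += value * value
    return stats


def merge_stats(a, b):
    """Combine two partial results, e.g. from two workers (assumed scheme)."""
    merged = {}
    for part in (a, b):
        for label, entry in part.items():
            if label not in merged:
                width = len(entry["sum"])
                merged[label] = {
                    "count": 0,
                    "sum": [0.0] * width,
                    "sq_sum": [0.0] * width,
                }
            out = merged[label]
            out["count"] += entry["count"]
            out["sum"] = [p + q for p, q in zip(out["sum"], entry["sum"])]
            out["sq_sum"] = [p + q for p, q in zip(out["sq_sum"], entry["sq_sum"])]
    return merged


def gaussian_params(stats):
    """Derive per-class mean and (population) variance from the sums."""
    params = {}
    for label, entry in stats.items():
        n = entry["count"]
        mean = [s / n for s in entry["sum"]]
        # Var[x] = E[x^2] - E[x]^2
        var = [q / n - m * m for q, m in zip(entry["sq_sum"], mean)]
        params[label] = (mean, var)
    return params


X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
y = [0, 0, 1, 1]

full = partial_stats(X, y)
# Simulate two workers that each see half of the rows.
worker_a = partial_stats(X[0::2], y[0::2])
worker_b = partial_stats(X[1::2], y[1::2])
combined = merge_stats(worker_a, worker_b)
params = gaussian_params(combined)
```

Because counts, sums, and sums of squares are all additive, merging partial results gives the same statistics as a single pass over all rows, which is why a distributed variant could plausibly reuse the single-GPU accumulation unchanged. The `E[x^2] - E[x]^2` form of the variance is simple but can lose precision when the mean is large relative to the spread.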