Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add boolean cases for CPU kernels #3059

Merged
merged 21 commits into from
Apr 20, 2024
Merged

Conversation

ManasviGoyal
Copy link
Collaborator

@ManasviGoyal ManasviGoyal commented Mar 21, 2024

  • Removed the following CPU Kernels and replaced their implementation with Numpy/Cupy -

    • awkward_unique
    • awkward_unique_copy
    • awkward_NumpyArray_fill
  • Added boolean cases for same kernels

  • Fixed and added some CUDA kernels

  • Tests!

@ManasviGoyal ManasviGoyal changed the title feat: use segmented_Scan for cuda kernels feat: use segmented_scan for cuda kernels Mar 21, 2024
@ManasviGoyal ManasviGoyal added the gpu Concerns the GPU implementation (backend = "cuda') label Mar 21, 2024
@ManasviGoyal ManasviGoyal changed the title feat: use segmented_scan for cuda kernels feat: add cuda kernels Apr 12, 2024
@ManasviGoyal ManasviGoyal changed the title feat: add cuda kernels feat: add boolean cases for CPU kernels Apr 15, 2024
@ManasviGoyal
Copy link
Collaborator Author

ManasviGoyal commented Apr 16, 2024

@jpivarski I had removed awkward_unique and awkward_unique_copy and replaced it with nplike.unique_values which makes the CUDA implementation simple too. But I see that it is failing for ubuntu minimal test as the equal_nan argument requires numpy >= 1.19.0. Should I go back to using the kernels and add a separate case for the CUDA kernels or would it be possible to bump up the minimum version?

@jpivarski
Copy link
Member

As usual, replacing kernels with NumPy/CuPy equivalents is a good thing to do, and I don't want to return to the custom kernel now that this mostly works.

One error, with the BitMaskedArray, might be because the whole content is being checked for uniqueness, rather than just the part up to the BitMaskedArray's length.

For the missing equal_nan argument in old versions of NumPy, you can implement it in the nplike layer, similar to the way that byteorder has been added to packbits for old NumPy versions. In the case of equal_nan, you would just need to check isnan for floating point types and do a logical_and of equality and isnan. It's not easy to check NumPy to see whether an argument exists (other than by version), but I don't think it would be bad to use this custom logic on all floating point types.

Copy link
Member

@jpivarski jpivarski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I verify that the CUDA tests all pass. All that's left is to fix the old-NumPy case (np.unique not having an equal_nan argument), and then this will be ready to merge.

@jpivarski
Copy link
Member

This looks like it's done. If so, please merge it. Thanks!

@ManasviGoyal ManasviGoyal merged commit 8a701a5 into main Apr 20, 2024
41 checks passed
@ManasviGoyal ManasviGoyal deleted the ManasviGoyal/segmented_scan branch April 20, 2024 19:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
gpu Concerns the GPU implementation (backend = "cuda')
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants