This repository has been archived by the owner on Mar 12, 2021. It is now read-only.
Extend accumulate!
#68
Maybe as a higher-order
I have a simple and naive version of
Yes, that would be nice! Is it based on the CUDAnative example?
Happy to help work on this and contribute an efficient I'm particularly interested in an efficient non-destructive The scan example seems to overwrite the array. I guess to make a non-destructive version you would take the work-efficient and accumulate into a temporary array? Currently
Fixed by #447
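For readers following the non-destructive question above, here is a minimal CPU-only sketch of a work-efficient (Blelloch-style) scan that works on a temporary copy and leaves its input untouched. It is not the CUDAnative example and not necessarily what #447 implemented. The name `blelloch_scan` and its `neutral` argument are hypothetical, and the sketch assumes the input length is a power of two.

```julia
# Hypothetical helper: CPU-only illustration of a work-efficient scan.
# Assumes length(x) is a power of two and `neutral` is the identity of `op`.
function blelloch_scan(op, x::AbstractVector, neutral)
    n = length(x)
    @assert n > 0 && ispow2(n)
    tmp = copy(x)                      # temporary array, so `x` is not overwritten

    # Up-sweep (reduce) phase: build partial results in place in `tmp`.
    d = 1
    while d < n
        for i in 2d:2d:n
            tmp[i] = op(tmp[i - d], tmp[i])
        end
        d *= 2
    end

    # Down-sweep phase: turns the partial results into an exclusive scan.
    tmp[n] = neutral
    d = n ÷ 2
    while d >= 1
        for i in 2d:2d:n
            left = tmp[i - d]
            tmp[i - d] = tmp[i]
            tmp[i] = op(tmp[i], left)
        end
        d ÷= 2
    end

    # Combine with the input to get the inclusive scan that `accumulate` returns.
    return map(op, tmp, x)
end

x = [1, 2, 3, 4]
y = blelloch_scan(+, x, 0)
```

On a GPU, each inner `for` loop would correspond to one parallel step; the extra temporary is the price of keeping the input intact.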
CuArrays has `accumulate!`, but it's limited: it does not support anything but vectors, does not support the `init` keyword, and is slow (it should use the shmem/shfl optimizations from https://github.com/JuliaGPU/CUDAnative.jl/blob/master/examples/scan.jl).

Old post:
@dpsanders and I just ran into a situation where we wanted to do a `cumsum` on a CuArray.
CUDAnative has it as an example, but we should probably add the functionality to CuArrays: https://github.com/JuliaGPU/CUDAnative.jl/blob/master/examples/scan.jl
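
For reference, the behaviour being asked for can be illustrated with Base Julia on ordinary CPU arrays. This is only a sketch of the semantics an extended GPU `accumulate!` would presumably mirror (scans along a dimension of a matrix, and the `init` keyword). It uses no CuArrays calls and says nothing about how the GPU version is implemented.

```julia
# CPU reference only: Base's accumulate!/cumsum on plain Arrays.
A = [1 2 3; 4 5 6]

# Scan along a chosen dimension of a matrix, not just a vector.
B = similar(A)
accumulate!(+, B, A; dims=2)      # B == [1 3 6; 4 9 15]

# Seed the scan with a starting value via the `init` keyword.
v = [1, 2, 3, 4]
w = similar(v)
accumulate!(+, w, v; init=10)     # w == [11, 13, 16, 20]

# cumsum is the `+` special case and returns a new array.
c = cumsum(v)                     # c == [1, 3, 6, 10]
```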