This repository has been archived by the owner on Mar 12, 2021. It is now read-only.

Commit

review
wongalvis14 authored Mar 26, 2020
1 parent 181c3ba commit eb28965
Showing 1 changed file with 2 additions and 3 deletions.
src/mapreduce.jl: 2 additions & 3 deletions
@@ -110,7 +110,7 @@ function partial_mapreduce_grid(f, op, A, R, neutral, Rreduce, Rother, gridDim_r
 
     val = op(neutral, neutral)
 
-    # get a value that should be reduced
+    # reduce serially across chunks of input vector that don't fit in a block
     ireduce = threadIdx_reduce + (blockIdx_reduce - 1) * blockDim_reduce
     while ireduce <= length(Rreduce)
         Ireduce = Rreduce[ireduce]
@@ -142,8 +142,7 @@ NVTX.@range function GPUArrays.mapreducedim!(f, op, R::CuArray{T}, A::AbstractAr
     # be conservative about using shuffle instructions
     shuffle = true
     shuffle &= capability(device()) >= v"3.0"
-    shuffle &= T in (Int32, Int64, Float32, Float64, ComplexF32, ComplexF64)
-    # TODO: add support for Bool (CUDAnative.jl#420)
+    shuffle &= T in (Bool, Int32, Int64, Float32, Float64, ComplexF32, ComplexF64)
 
     # iteration domain, split in two: one part covers the dimensions that should
     # be reduced, and the other covers the rest. combining both covers all values.
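The reworded comment in the first hunk describes the grid-stride pattern used by the kernel: when the reduction dimension is larger than one block can cover, each thread walks the input in strides of the whole grid and folds the elements it visits into a private accumulator. A minimal CPU-side sketch of that pattern in plain Julia (not the package's kernel; partial_reduce_sketch, nblocks, and nthreads are illustrative names, not taken from mapreduce.jl):

# CPU sketch of the grid-stride serial reduction the new comment describes.
function partial_reduce_sketch(f, op, A, neutral; nblocks=4, nthreads=8)
    partials = fill(neutral, nthreads, nblocks)   # one accumulator per (thread, block)
    for b in 1:nblocks, t in 1:nthreads
        val = op(neutral, neutral)
        # reduce serially across chunks of the input that don't fit in a block
        i = t + (b - 1) * nthreads                # mirrors threadIdx/blockIdx indexing
        while i <= length(A)
            val = op(val, f(A[i]))
            i += nthreads * nblocks               # stride by the whole grid
        end
        partials[t, b] = val
    end
    # a real kernel would next combine the per-thread values within each block
    # (via shuffles or shared memory) and emit one partial result per block
    return reduce(op, partials)
end

partial_reduce_sketch(abs2, +, collect(1:100), 0) == sum(abs2, 1:100)   # true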
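The second hunk drops the old TODO and adds Bool to the element types allowed to use the shuffle-based reduction path; the compute-capability check is unchanged. A standalone sketch of that eligibility gate, assuming it can be written as a plain predicate (has_shuffle_support is a hypothetical helper, not part of the package):

# Hypothetical predicate mirroring the `shuffle &= ...` gate after this commit.
has_shuffle_support(::Type{T}, cap) where {T} =
    cap >= v"3.0" && T in (Bool, Int32, Int64, Float32, Float64, ComplexF32, ComplexF64)

has_shuffle_support(Bool, v"7.0")     # true: Bool is newly on the whitelist
has_shuffle_support(Float16, v"7.0")  # false: not on the whitelist
has_shuffle_support(Bool, v"2.0")     # false: device capability below 3.0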
