Nvidia Xavier devices: exception thrown during kernel execution on device Xavier #349
Comments
Since you have these packages Also, did you not see additional output from the device? It should have printed something (if anything, a recommendation to run under This also probably has nothing to do with running on a Xavier.
Sorry, that wasn't a verbatim output sequence. I just added the top 4 lines as an MWE, then pasted the error below. I'll re-run this on the device and report more.
Yeah, this is happening on a GeForce GTX 1650 system too, with JLL-based CUDA. I've extracted the code, which serves as an MWE. By the way, this code block has been unchanged recently, except for a switch from
function keepdetections(input::CuArray) # THREADS:BLOCKS CAN BE OPTIMIZED WITH BETTER KERNEL
    rows, cols = size(input)
    bools = CUDA.zeros(Int32, cols)
    @cuda blocks=cols threads=rows kern_genbools(input, bools)
    idxs = cumsum(bools)
    n = count(bools)
    output = CuArray{Float32, 2}(undef, rows, n)
    @cuda blocks=cols threads=rows kern_keepdetections(input, output, bools, idxs)
    return output
end
function kern_genbools(input::CuDeviceArray, output::CuDeviceArray)
    col = (blockIdx().x-1) * blockDim().x + threadIdx().x
    cols = gridDim().x
    if col < cols && input[5, col] > Float32(0)
        @inbounds output[col] = Int32(1)
    end
    return
end
@inline function kern_keepdetections(input::CuDeviceArray, output::CuDeviceArray,
                                     bools::CuDeviceArray, idxs::CuDeviceArray)
    col = blockIdx().x
    row = threadIdx().x
    if bools[col] == Int32(1)
        idx = idxs[col]
        @inbounds output[row, idx] = input[row, col]
    end
    return
end
Any idea why this has broken?
Seems to be a problem with count
Fixed with this. But I'm not sure why this started erroring.
Yeah, it's probably this type error you were seeing:
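A minimal CPU-only sketch of how `count` can raise a type error, assuming the cause is that `bools` holds `Int32` values rather than `Bool`. That cause is an assumption, not something confirmed above. Plain `Array`s stand in for the `CuArray`s in the MWE, and `threw`, `n_pred`, and `n_sum` are illustrative names only:

```julia
# CPU-only sketch; a plain Vector stands in for the CuArray from the MWE (assumption).
bools = Int32[1, 0, 1, 1]

# `count` without a predicate expects Bool elements, so Int32 input throws a TypeError.
threw = try
    count(bools)
    false
catch err
    err isa TypeError
end

# Two ways to count the flagged entries that do not rely on Bool elements.
n_pred = count(==(Int32(1)), bools)  # count with an explicit predicate
n_sum = sum(bools)                   # or sum the 0/1 flags
```

Whether either alternative matches the change that fixed the original code is not known from this thread. Allocating the flags as `Bool` in the first place might be another option.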
Describe the bug
Using ObjectDetector.jl (which uses CUDA via Flux) on v1.5.0, I hit this kernel issue on NVIDIA Xavier devices.
Note that this is one of the NVIDIA JetPack-based devices, which use a modified CUDA.
Manifest.toml
Expected behavior
A clear and concise description of what you expected to happen.
Version info
Details on Julia:
Details on CUDA:
Additional context