Add texture support from CuTextures.jl #209
I wonder if it would be valuable to try to expose this through array abstractions like broadcast. Since those typically require matching sizes, I could imagine something like
@ChrisRackauckas, you've expressed interest in this; what's your use case?
Codecov Report
@@            Coverage Diff             @@
##           master     #209      +/-   ##
==========================================
- Coverage   80.53%   80.13%   -0.41%
==========================================
  Files         152      154       +2
  Lines        9911    10174     +263
==========================================
+ Hits         7982     8153     +171
- Misses       1929     2021      +92
julia> a = CUDA.rand(2,2)
2×2 CuArray{Float32,2,Nothing}:
0.148875 0.386196
0.0170659 0.696161
julia> b = similar(a)
2×2 CuArray{Float32,2,Nothing}:
0.0 0.0
0.0 0.0
julia> b .= CuTexture(a)
2×2 CuArray{Float32,2,Nothing}:
0.148875 0.148875
0.0170659 0.0170659
julia> @device_code_llvm debuginfo=:none b .= CuTexture(a)
; PTX CompilerJob of kernel broadcast(CUDA.CuKernelContext, CuDeviceArray{Float32,2,CUDA.AS.Global}, Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64},Base.OneTo{Int64}},typeof(identity),Tuple{Base.Broadcast.Extruded{CuDeviceTexture{Float32,2,false},Tuple{Bool,Bool},Tuple{Int64,Int64}}}}) for sm_75
...
L152: ; preds = %L146, %L141
%31 = load i64, i64* %56, align 8
%32 = call [4 x float] @llvm.nvvm.tex.unified.2d.v4f32.s32(i64 %31, i32 %57, i32 %28)
%.fca.0.extract = extractvalue [4 x float] %32, 0
%33 = getelementptr inbounds { [2 x i64], i64 }, { [2 x i64], i64 }* %0, i64 0, i32 1
%34 = bitcast i64* %33 to float addrspace(1)**
%35 = load float addrspace(1)*, float addrspace(1)** %34, align 8
%36 = getelementptr float, float addrspace(1)* %35, i64 %27
store float %.fca.0.extract, float addrspace(1)* %36, align 4
br label %L33
...
}
d3147a2 to cf97309
6cc3a54 to e8ec49b
…to adjust for 1-based indexing.
Requires users to reinterpret arrays if, e.g., working with ColorTypes.
Step towards implementing the AbstractArray interface.
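The reinterpretation mentioned in the commit notes above can be sketched in plain Julia. `PixelRGBA` below is a hypothetical stand-in for a ColorTypes pixel type, and the choice of `NTuple{4,Float32}` as the element type a texture would accept is an assumption, not something this PR states. Only `Base.reinterpret` is used, so the sketch runs without a GPU.

```julia
# Hypothetical stand-in for a ColorTypes pixel such as RGBA{Float32}.
# It is not part of this PR; it only has the same 4 x Float32 memory layout.
struct PixelRGBA
    r::Float32
    g::Float32
    b::Float32
    alpha::Float32
end

# A 1×2 "image" of colour pixels.
pixels = [PixelRGBA(0.1f0, 0.2f0, 0.3f0, 1.0f0) PixelRGBA(0.4f0, 0.5f0, 0.6f0, 1.0f0)]

# View the same memory as plain 4-tuples (assumed here to be the element type
# one would hand to a texture). No data is copied.
raw = reinterpret(NTuple{4,Float32}, pixels)

# After fetching, results can be viewed as colour pixels again.
restored = reinterpret(PixelRGBA, raw)
```

Because both element types are 16-byte bits types without padding, `raw` keeps the shape of `pixels` and the round trip through `restored` loses nothing.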
CI is green, so let's merge this. The API is not final yet: it requires explicit use of CuTexture/CuTextureArray objects (more or less as @cdsousa designed it, but integrated with the rest of the stack). I'd like to try making CuArray compatible with it, dispatching on the inner buffer object.
Nice!!! Did the issue above get fixed? I started looking at it and was able to test it in my original code, where it was OK, but then I had no more time to continue.
Which issue? The tests are mostly the same as they were in your version.
Wasn't broadcast assignment supposed to give the same as a copy?
Ah, I didn't even see they mismatched! I was just happy it generated appropriate code. But it seems to work now:
Although the exact example doesn't anymore:
Anyway, CuTexture isn't really ready to be used with broadcast yet, but it's nice to see the essentials work already.
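The expectation raised in this exchange, that broadcast assignment from a texture should give the same result as a copy, can be modelled on the CPU. The sketch below is not the CuTexture implementation from this PR: `FakeTexture` is a hypothetical wrapper, and the mapping of a 1-based index `i` to the texel-centre coordinate `i - 0.5` is an assumption about how a 1-based adjustment could look, combined with CUDA's documented nearest-point rule (texel = floor of the unnormalized coordinate).

```julia
# Hypothetical CPU stand-in for a texture; NOT the CuTexture type from this PR.
struct FakeTexture{T,N} <: AbstractArray{T,N}
    data::Array{T,N}
end

# Minimal AbstractArray interface: size plus N-dimensional integer getindex.
Base.size(t::FakeTexture) = size(t.data)

function Base.getindex(t::FakeTexture{T,N}, I::Vararg{Int,N}) where {T,N}
    # Assumed convention: 1-based index i corresponds to the 0-based texture
    # coordinate i - 0.5 (the centre of texel i - 1). Nearest-point sampling
    # takes floor(coordinate); adding 1 converts back to a Julia index.
    texel = map(i -> floor(Int, i - 0.5f0) + 1, I)
    return t.data[texel...]
end

# Same values as the REPL session earlier in the thread.
a = Float32[0.148875 0.386196; 0.0170659 0.696161]
b = similar(a)

# Broadcast assignment reads every element through getindex.
b .= FakeTexture(a)
```

With each dimension indexed independently, `b` ends up equal to `a`. The mismatched output shown earlier, where the second column repeats the first, is the kind of result this check would catch.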
I messed up and #206 got closed.