Support for OpenCL floatN and doubleN types #18

Closed
vchuravy opened this issue Mar 4, 2014 · 3 comments

Comments

@vchuravy
Member

vchuravy commented Mar 4, 2014

It would be nice if OpenCL.jl supported, on Julia's side, the halfN, floatN and doubleN types that OpenCL provides.

Currently it is possible to initialize a double4 buffer with:

    test_buff = cl.Buffer(Float64, ctx, :rw, Ndim * Mdim * 4)

But working with that data on Julia's side is a hassle.
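To illustrate the hassle and one possible workaround, here is a minimal sketch that assumes the buffer contents have already been read back into a flat `Vector{Float64}` (the readback itself is not shown, and `flat` and `as_double4` are names made up for this example). Base Julia's `reinterpret` can view the flat data as 4-tuples without copying:

```julia
# Flat host data as it would sit in a double4 buffer (two elements here).
flat = Float64[1, 2, 3, 4, 5, 6, 7, 8]

# View the same memory as 4-tuples of Float64; no copy is made.
as_double4 = reinterpret(NTuple{4,Float64}, flat)

# Read the second double4 element.
second = as_double4[2]

# Writes through the view land in the underlying flat array.
as_double4[1] = (10.0, 20.0, 30.0, 40.0)
```

This only addresses indexing; arithmetic on the tuples would still have to be written by hand.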

@jakebolewski
Member

I agree that this would be nice to have. We could even emulate these types nicely with an array of immutable SIMD types that follow OpenCL's API, e.g.

    # Declared with `immutable` when this was written; Julia 1.x spells it `struct`.
    struct Float4
        s0::Float32
        s1::Float32
        s2::Float32
        s3::Float32
    end
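As a rough sketch of how such an emulated type could be used on the host, the snippet below defines the type in current Julia syntax and views an array of them as the flat `Float32` data a float4 buffer would hold. The variable names are hypothetical and nothing here calls into OpenCL.jl:

```julia
# Four packed Float32 fields, mirroring OpenCL's float4 component names.
struct Float4
    s0::Float32
    s1::Float32
    s2::Float32
    s3::Float32
end

# The default constructor converts the integer literals to Float32.
host = [Float4(1, 2, 3, 4), Float4(5, 6, 7, 8)]

# The struct is a plain-bits type of 16 bytes, so the array can be viewed
# as a flat run of Float32 values without copying.
flat = reinterpret(Float32, host)
```

Because the struct is immutable, updating one component means building a new `Float4` and storing it back into the array.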

However, it looks like Julia will get SIMD types in the near future, so I'm hesitant to implement this now when we could take advantage of built-in SIMD types later.

Another caveat is buffer alignment. The OpenCL spec says it is the user's job to make sure memory is aligned correctly when host array pointers are used as memory buffers (the use-host-pointer option for buffer construction). If we want to take advantage of SIMD types on CPU OpenCL platforms, this needs to be correct. If I remember correctly, Julia aligns all arrays to 16-byte boundaries. That would be correct for float4 OpenCL buffers but not for float8, which requires 32-byte alignment. Right now there is no way of controlling the alignment of Julia's arrays, but this might change with the introduction of SIMD types, to take advantage of aligned SIMD load/store instructions, which are often 2x faster than unaligned loads/stores.

If the contents are not aligned correctly, I think it is up to the runtime to figure out what to do. It could use unaligned loads/stores for SIMD types, or copy the contents of the buffer to a new buffer with the correct alignment. This is less of a concern when using GPUs, as unaligned buffers are copied anyway.
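The alignment concern can be checked from Julia before handing a host pointer to OpenCL. The sketch below is only an illustration: `cl_alignment` and `is_aligned` are hypothetical helpers, not part of OpenCL.jl, and the rule they encode is that an OpenCL vector type is aligned to its own size, with 3-component vectors sized like 4-component ones.

```julia
# Alignment in bytes expected for an n-component OpenCL vector of element type T.
# 3-component vectors are sized and aligned like 4-component vectors.
cl_alignment(::Type{T}, n::Integer) where {T} = sizeof(T) * (n == 3 ? 4 : n)

# True if the address is a multiple of the requested alignment.
is_aligned(addr::Integer, alignment::Integer) = UInt(addr) % UInt(alignment) == 0

# Same check applied to the first element of a Julia array.
is_aligned(a::Array, alignment::Integer) = is_aligned(UInt(pointer(a)), alignment)

a = zeros(Float32, 8 * 16)                   # room for 16 float8 values
need = cl_alignment(Float32, 8)              # 32 bytes for float8
ok_for_float8 = is_aligned(a, need)          # depends on the allocator
```

If the check fails, the options are the ones described above: fall back to a copy into suitably aligned storage, or let the OpenCL runtime deal with the unaligned pointer.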

@nstiurca
Contributor

One possible use case for OpenCL's float2/float4 is to represent complex numbers or quaternions. Likewise, uchar3/uchar4 often represent RGB/RGBA pixels. We should consider whether there is a clean way to map between, e.g., Julia's Complex <--> half2/float2/double2, or Julia's Quaternion <--> float4, or to map to Colors.RGB, etc.
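For the complex-number case, Base Julia already allows a copy-free mapping, because a `Complex{Float32}` is stored as two consecutive `Float32` values (real part, then imaginary part), which matches a float2 layout. A small sketch with made-up variable names, leaving quaternion and color types aside since they live in external packages:

```julia
# Host-side complex data.
z = ComplexF32[1 + 2im, 3 - 4im]

# View it as interleaved (re, im) Float32 pairs, i.e. float2-style data.
as_floats = reinterpret(Float32, z)

# Going the other way: flat float2-style data viewed as complex numbers.
raw = Float32[1, 2, 3, -4]
back = reinterpret(ComplexF32, raw)
```

The same pattern would apply to `ComplexF64` and double2.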

@nstiurca
Contributor

Also, there is no reason to limit this to float/double support. We should also support all integer types, as well as half (Float16) if the OpenCL device supports it.
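One way to cover every element type at once, sketched here under the assumption that a plain tuple is an acceptable host-side stand-in, is a parametric alias over `NTuple`. `CLVec` is a hypothetical name for this example, not an OpenCL.jl type:

```julia
# Hypothetical alias: an N-component vector of element type T.
const CLVec{N,T} = NTuple{N,T}

# The host-side sizes match the corresponding OpenCL vector types.
size_uchar4 = sizeof(CLVec{4,UInt8})     # uchar4
size_half2  = sizeof(CLVec{2,Float16})   # half2
size_int8   = sizeof(CLVec{8,Int32})     # int8

# Flat Float16 data viewed as half2-style pairs.
halves = reinterpret(CLVec{2,Float16}, Float16[1, 2, 3, 4])
```

Note that 3-component vectors would need separate handling, since OpenCL pads them to four components, and that half support on the device side is optional.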

@juliohm juliohm closed this as completed Oct 6, 2022

4 participants