The CLEvaluator class holds and owns both a cl_program and its cl_kernels, which is not ideal for performance. A cl_program represents compiled OpenCL code from which cl_kernels can be created. We have found that on AMD graphics cards, and to a lesser extent on OSX, compiling the OpenCL kernels used for OSD is quite slow. For this reason we want to compile the kernels once and re-use them.
There is a way to do this with OSD, and that is to re-use the CLEvaluator. That would be fine except we want to use the CLEvaluator in threaded code, and the cl_kernels are not thread safe! Each cl_kernel represents a function call, so if multiple threads set arguments on the same cl_kernel and run it, we'll have data races everywhere.
To work around this I'm using a very slightly modified version of OSD. The modification I made was to change the CLEvaluator data members from private to protected. With that change I am able to derive from CLEvaluator and change its behavior with respect to cl_program compilation. I added a cl_program cache, which is checked for an existing compiled version of the program before compiling it. This change is enough to resolve my performance issue.
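To illustrate the idea, here is a minimal sketch of such a cache. It is not the actual code from my derived class: `ProgramCache`, `GetOrBuild`, and the string key are hypothetical names, and the program type is a template parameter so the sketch compiles without the OpenCL headers. In real use `Program` would be `cl_program`, the builder would wrap the existing compilation step, and the key would presumably need to cover the context, the kernel source, and the build options.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <mutex>
#include <string>

// Hypothetical thread-safe cache of compiled programs.
// Program stands in for cl_program; it must be copyable.
template <typename Program>
class ProgramCache {
public:
    using Builder = std::function<Program()>;

    // Returns the cached program for `key`. `build` is invoked only on a
    // cache miss. The lock is held during the build, so concurrent misses
    // are serialized and each key is compiled at most once.
    Program GetOrBuild(const std::string& key, const Builder& build) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = cache_.find(key);
        if (it != cache_.end()) {
            return it->second;
        }
        Program program = build();
        cache_.emplace(key, program);
        return program;
    }

    // Number of distinct programs currently cached.
    std::size_t Size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return cache_.size();
    }

private:
    mutable std::mutex mutex_;
    std::map<std::string, Program> cache_;
};
```

With a shared program in hand, each evaluator (and so each thread) can still create its own cl_kernels from it via `clCreateKernel`, which avoids the data races described above. A real version would also need to manage the program's reference count (`clRetainProgram` / `clReleaseProgram`), which this sketch leaves out.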
My question is: would it be possible to update the Pixar version of OSD to have protected data members rather than private data members? If so, I won't have to make any changes to OSD itself to work around the performance problem.
Thanks so much for raising this @williamkrick -- there is definitely an opportunity here for a better refactoring. We'll add this to our list to consider for the next substantial release of OSD.