[DAPHNE-191] Refactor CUDA buffer mgmt (aka introducing Object Meta D… #334
…ata)

This change introduces a major change in how external storage buffers (CUDA memory specifically) are handled. In that regard, the following noteworthy changes are implemented:

* Factor out CUDA allocations from DenseMatrix (one of the initial motivations of issue #191).
* Introduce a mechanism to handle several storage backends and track ranges of a Structure's data.
* To make use of the mechanism, an AllocationDescriptor is passed to create() and getValues() (at the moment only DenseMatrix is supported).
* Allocation descriptors need to implement the IAllocationDescriptor interface. This decouples backend-specific dependencies.
* AllocationDescriptorHost and AllocationDescriptorCUDA are implemented at the moment. The former is more or less a no-op for now.
* The CUDA memory allocation and data movement is moved to the CUDAContext class. It keeps track of its allocations per device. For now this does nothing but can be used to reuse allocations in the future.

Closes #191, Closes #334
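To make the descriptor idea above easier to picture, here is a minimal sketch of how such a mechanism could look. It is not the code from this PR: `IAllocDescSketch`, `HostAllocDesc`, `MockDeviceAllocDesc`, `MockDeviceContext`, `makeBuffer` and `findAllocation` are hypothetical names invented for illustration, and the "device" is plain host memory so the example builds without CUDA. It only assumes what the description states: descriptors implement a common interface, and a context object records allocations per device.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical backend tags for this sketch (not DAPHNE's actual enum).
enum class Backend { Host, MockDevice };

// Simplified stand-in for the IAllocationDescriptor interface named above.
struct IAllocDescSketch {
    virtual ~IAllocDescSketch() = default;
    virtual Backend backend() const = 0;
    virtual std::string location() const = 0;
    virtual void allocate(std::size_t bytes) = 0;
    virtual std::shared_ptr<unsigned char> data() const = 0;
};

// Zero-initialized byte buffer with an array deleter.
inline std::shared_ptr<unsigned char> makeBuffer(std::size_t bytes) {
    return std::shared_ptr<unsigned char>(new unsigned char[bytes](),
                                          [](unsigned char *p) { delete[] p; });
}

// Host descriptor: little more than a heap allocation.
class HostAllocDesc : public IAllocDescSketch {
    std::shared_ptr<unsigned char> buf_;
public:
    Backend backend() const override { return Backend::Host; }
    std::string location() const override { return "host"; }
    void allocate(std::size_t bytes) override { buf_ = makeBuffer(bytes); }
    std::shared_ptr<unsigned char> data() const override { return buf_; }
};

// Mock context that records requested allocation sizes per device id.
// It hands out host memory; a real context would call into the device runtime.
class MockDeviceContext {
    std::map<int, std::vector<std::size_t>> allocs_;
public:
    std::shared_ptr<unsigned char> allocate(int device, std::size_t bytes) {
        assert(device >= 0);
        allocs_[device].push_back(bytes);
        return makeBuffer(bytes);
    }
    // Sum of all sizes ever requested on this device (frees are not tracked).
    std::size_t bytesOn(int device) const {
        auto it = allocs_.find(device);
        if (it == allocs_.end()) return 0;
        std::size_t total = 0;
        for (std::size_t b : it->second) total += b;
        return total;
    }
};

// Device descriptor: delegates allocation to the context it was created with.
class MockDeviceAllocDesc : public IAllocDescSketch {
    MockDeviceContext &ctx_;
    int device_;
    std::shared_ptr<unsigned char> buf_;
public:
    MockDeviceAllocDesc(MockDeviceContext &ctx, int device)
        : ctx_(ctx), device_(device) {}
    Backend backend() const override { return Backend::MockDevice; }
    std::string location() const override {
        return "mock-device:" + std::to_string(device_);
    }
    void allocate(std::size_t bytes) override {
        buf_ = ctx_.allocate(device_, bytes);
    }
    std::shared_ptr<unsigned char> data() const override { return buf_; }
};

// Returns the first descriptor for the requested backend, or nullptr.
const IAllocDescSketch *findAllocation(
    const std::vector<std::unique_ptr<IAllocDescSketch>> &descs,
    Backend wanted) {
    for (const auto &d : descs)
        if (d->backend() == wanted) return d.get();
    return nullptr;
}
```

A data structure could hold a vector of such descriptors and pick one by backend when values are requested. Because callers only see the interface, backend-specific headers stay out of the matrix class, which is the decoupling the description refers to.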
05f4429 to b2f18f6
Thank you @corepointer for this great contribution! One thing though: on my machine, when I run the following script: I noticed that you commented out the initialization of the result matrix on the
Thank you, @aristotelis96, for trying out these code changes, finding the bugs, and pointing me right to them :D Obviously I did not clean up the code properly, and code that I commented out for testing was not put back in place. I also found a few other issues while cleaning it up and testing it. New version coming soon ©️
Thanks @corepointer for fixing those issues. I found a new bug with this PR (I haven't tested which commit might have introduced it, though). This bug applies only to the vectorized engine (
a7228c0 to d2d95a5
I fixed the transpose. Thanks for spotting this. The issue should have been there before the introduction of the object meta data feature 🤔
d2d95a5 to 2e483bf
2e483bf to 534fbc9
The code contained in LoadPartitioning.h does not need to be included all over the place (through inclusion in DaphneUserConfig.h)
534fbc9 to 31a0852