[DAPHNE-191] Refactor CUDA buffer mgmt (aka introducing Object Meta D… #334

corepointer · 2022-04-06T12:19:39Z

…ata)

This change introduces a major change how external storage buffers (CUDA memory specifically) are handled. In that regard, the following noteworthy changes are implemented:

Factor out CUDA allocations from DenseMatrix (one of the initial motivations of issue Refactor CUDA buffer mgmt #191)
Introduce a mechanism to handle several storage backends and track ranges of a Structure's data.
To make use of the mechanism, an AllocationDescriptor is passed to create() and getValues() (at the moment only DenseMatrix is supported).
Allocation descriptors need to implement the IAllocationDescriptor interface. This decouples backend specific dependencies.
AllocationDescriptorHost and AllocationDescriptorCUDA are implemented atm. The former is more or less a no-op for now.
The CUDA memory allocation and data movement is moved to the CUDAContext class. It keeps track of its allocations per device. For now this does nothing but can be used to reuse allocations in the future.

…ata) This change introduces a major change how external storage buffers (CUDA memory specifically) are handled. In that regard, the following noteworthy changes are implemented: * Factor out CUDA allocations from DenseMatrix (one of the initial motivations of issue #191) * Introduce a mechanism to handle several storage backends and track ranges of a Structure's data. * To make use of the mechanism, an AllocationDescriptor is passed to create() and getValues() (at the moment only DenseMatrix is supported). * Allocation descriptors need to implement the IAllocationDescriptor interface. This decouples backend specific dependencies. * AllocationDescriptorHost and AllocationDescriptorCUDA are implemented atm. The former is more or less a no-op for now. * The CUDA memory allocation and data movement is moved to the CUDAContext class. It keeps track of its allocations per device. For now this does nothing but can be used to reuse allocations in the future. Closes #191, Closes #334

aristotelis96 · 2022-05-17T09:21:52Z

Thank you @corepointer for this great contribution! One thing though, on my machine when I run the following script:
a = rand(5,5,0.0,10.0,1,-1);
print(a+a);
on the local runtime: ./build/bin/daphne test.script I get a segmentation fault.

I noticed that you comment out the initalization of the result matrix on the EwBinaryMat.h kernel. Un-commenting that out fixes the error, however the error re-appears when running with vectorized execution (./build/bin/daphnec --vec test.script).

corepointer · 2022-05-18T10:52:51Z

Thank you, @aristotelis96, for trying out these code changes and finding the bugs and pointing me right to them :D Obviously I did not clean up the code properly and code, that I commented for testing was not put in place again. I also found a few other issues. While cleaning it up and testing it. New version coming soon ©️

aristotelis96 · 2022-05-21T08:41:17Z

Thanks @corepointer for fixing those issues. I found a new bug with this PR (haven't tested which commit might have introduced it though). This bug applies only for the vectorized engine (--vec). There seems to be something wrong with the transpose function. The following script is returning different results.
The result dimensions seem to be fine, however the resulted values of the matrix, are wrong.

a = rand(10,5,0.0,10.0,1,123);
print(t(a));

…ata) This change introduces a major change how external storage buffers (CUDA memory specifically) are handled. In that regard, the following noteworthy changes are implemented: * Factor out CUDA allocations from DenseMatrix (one of the initial motivations of issue #191) * Introduce a mechanism to handle several storage backends and track ranges of a Structure's data. * To make use of the mechanism, an AllocationDescriptor is passed to create() and getValues() (at the moment only DenseMatrix is supported). * Allocation descriptors need to implement the IAllocationDescriptor interface. This decouples backend specific dependencies. * AllocationDescriptorHost and AllocationDescriptorCUDA are implemented atm. The former is more or less a no-op for now. * The CUDA memory allocation and data movement is moved to the CUDAContext class. It keeps track of its allocations per device. For now this does nothing but can be used to reuse allocations in the future. Closes #191, Closes #334

corepointer · 2022-05-24T07:17:10Z

I fixed the transpose.Thanks for spotting this. The issue should have been there before the introduction of the object meta data feature 🤔
Sorry for force pushing - I rebased this branch to the latest in main. This means you will get conflicts pulling updates in this branch. You can work around it by dropping your local version of this branch. Either delete your local 191-metadata-allocations branch and check out again or if you're currently on that branch, do a git fetch and git reset --hard origin/191-metadata-allocations. In any case this will remove changes you might have made to this branch. Uncommitted changes can be stashed of course ;-)

…ata) This change introduces a major change how external storage buffers (CUDA memory specifically) are handled. In that regard, the following noteworthy changes are implemented: * Factor out CUDA allocations from DenseMatrix (one of the initial motivations of issue daphne-eu#191) * Introduce a mechanism to handle several storage backends and track ranges of a Structure's data. * To make use of the mechanism, an AllocationDescriptor is passed to create() and getValues() (at the moment only DenseMatrix is supported). * Allocation descriptors need to implement the IAllocationDescriptor interface. This decouples backend specific dependencies. * AllocationDescriptorHost and AllocationDescriptorCUDA are implemented atm. The former is more or less a no-op for now. * The CUDA memory allocation and data movement is moved to the CUDAContext class. It keeps track of its allocations per device. For now this does nothing but can be used to reuse allocations in the future. Closes daphne-eu#191, Closes daphne-eu#334

…ata) This change introduces a major change how external storage buffers (CUDA memory specifically) are handled. In that regard, the following noteworthy changes are implemented: * Factor out CUDA allocations from DenseMatrix (one of the initial motivations of issue #191) * Introduce a mechanism to handle several storage backends and track ranges of a Structure's data. * To make use of the mechanism, an AllocationDescriptor is passed to create() and getValues() (at the moment only DenseMatrix is supported). * Allocation descriptors need to implement the IAllocationDescriptor interface. This decouples backend specific dependencies. * AllocationDescriptorHost and AllocationDescriptorCUDA are implemented atm. The former is more or less a no-op for now. * The CUDA memory allocation and data movement is moved to the CUDAContext class. It keeps track of its allocations per device. For now this does nothing but can be used to reuse allocations in the future. Closes #191, Closes #334

The code contained in LoadPartitioning.h does not need to be included all over the place (through inclusion in DaphneUserConfig.h)

…t implementation

corepointer force-pushed the 191-metadata-allocations branch from 05f4429 to b2f18f6 Compare April 6, 2022 12:21

corepointer added the Accelerators label Apr 27, 2022

aristotelis96 mentioned this pull request Apr 28, 2022

Draft: Distributed Runtime update #330

Closed

aristotelis96 mentioned this pull request May 18, 2022

[Distributed] - Code Structure for the Distributed Runtime (Object Metadata) #367

Closed

corepointer force-pushed the 191-metadata-allocations branch from a7228c0 to d2d95a5 Compare May 24, 2022 06:57

corepointer force-pushed the 191-metadata-allocations branch from d2d95a5 to 2e483bf Compare August 21, 2022 14:08

corepointer force-pushed the 191-metadata-allocations branch from 2e483bf to 534fbc9 Compare August 25, 2022 16:27

corepointer added 4 commits August 25, 2022 22:26

[MINOR] Optimized imports

ad59236

The code contained in LoadPartitioning.h does not need to be included all over the place (through inclusion in DaphneUserConfig.h)

[BUGFIX] Restrict transpose optimization due to currently insufficien…

ecab1df

…t implementation

[MINOR] A debug print utility function for CUDA buffer content

31a0852

corepointer force-pushed the 191-metadata-allocations branch from 534fbc9 to 31a0852 Compare August 25, 2022 20:30

corepointer merged commit 31a0852 into main Aug 25, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[DAPHNE-191] Refactor CUDA buffer mgmt (aka introducing Object Meta D… #334

[DAPHNE-191] Refactor CUDA buffer mgmt (aka introducing Object Meta D… #334

corepointer commented Apr 6, 2022 •

edited

Loading

aristotelis96 commented May 17, 2022 •

edited

Loading

corepointer commented May 18, 2022

aristotelis96 commented May 21, 2022 •

edited

Loading

corepointer commented May 24, 2022

[DAPHNE-191] Refactor CUDA buffer mgmt (aka introducing Object Meta D… #334

[DAPHNE-191] Refactor CUDA buffer mgmt (aka introducing Object Meta D… #334

Conversation

corepointer commented Apr 6, 2022 • edited Loading

aristotelis96 commented May 17, 2022 • edited Loading

corepointer commented May 18, 2022

aristotelis96 commented May 21, 2022 • edited Loading

corepointer commented May 24, 2022

corepointer commented Apr 6, 2022 •

edited

Loading

aristotelis96 commented May 17, 2022 •

edited

Loading

aristotelis96 commented May 21, 2022 •

edited

Loading