Refactoring MetaDataObject out of DenseMatrix #758
Open
corepointer wants to merge 3 commits into daphne-eu:main from corepointer:mdo_csr_cuda_refactor
corepointer added a commit to corepointer/daphne that referenced this pull request on Jul 22, 2024:
* This commit introduces the meta data object to the CSR data type
* Memory pinning: To prevent excessive allocation ID lookups in the hot path when using --vec, this change "pins" memory by allocation type of previous accesses.
Draft
corepointer added further commits to corepointer/daphne that referenced this pull request on Jul 29, 2024, Aug 19, 2024, and Oct 18, 2024, each with the same commit message as the Jul 22, 2024 commit.
corepointer force-pushed the mdo_csr_cuda_refactor branch from df4702e to cfd8053 on October 18, 2024 15:15
corepointer added a commit to corepointer/daphne that referenced this pull request on Oct 18, 2024:
… Pinning
* This commit introduces the meta data object to the CSRMatrix data type. To implement this change, handling of the AllocationDescriptors has been refactored out of DenseMatrix.
* Separate handling of ranges: Since tracking of ranges of data is only used in the distributed setting for now, we handle this separately and always assume a full allocation for local computation. This should result in fewer unnecessary "if range not null do this, else do that" branches.
* Memory pinning: To prevent excessive allocation ID lookups in the hot path, especially when using --vec, this change "pins" memory by allocation type of previous accesses. Simply put, as long as there is no different access type (e.g., a call to getValues() for host vs. device memory), it is assumed that the data has not changed and the meta data object does not need to be queried.
Closes daphne-eu#758
corepointer force-pushed the mdo_csr_cuda_refactor branch five more times on October 18, 2024. Between these force pushes, four further commits to corepointer/daphne referenced this pull request, each with the same commit message as above.
* 15:31: from cfd8053 to 17d3baa
* 17:04: from 17d3baa to d9d1b59
* 17:06: from d9d1b59 to 9016ae9
* 17:10: from 9016ae9 to 6f6da3b
* 17:12: from 6f6da3b to ce36921
The numerous force pushes are a result of my local clang-format disagreeing with the CI's clang-format:
--- src/runtime/local/datastructures/AllocationDescriptorGRPC.h (original)
+++ src/runtime/local/datastructures/AllocationDescriptorGRPC.h (reformatted)
@@ -35,7 +35,7 @@
   public:
     AllocationDescriptorGRPC() = default;
     AllocationDescriptorGRPC(DaphneContext *ctx, const std::string &address, const DistributedData &data)
-        : ctx(ctx), workerAddress(address), distributedData(data) {};
+        : ctx(ctx), workerAddress(address), distributedData(data){};
     ~AllocationDescriptorGRPC() override = default;
     [[nodiscard]] ALLOCATION_TYPE getType() const override { return type; };
corepointer added the feature, performance, Accelerators, and Distributed labels on Oct 18, 2024
Explaining the labels:
…not throw
Changing the behavior of fileExists() to a boolean operation, as suggested by the method's name. Throwing an exception is up to the caller of this method.
Closes daphne-eu#867
… Pinning
* This commit introduces the meta data object to the CSRMatrix data type. To implement this change, handling of the AllocationDescriptors has been refactored out of DenseMatrix.
* Separate handling of ranges: Since tracking of ranges of data is only used in the distributed setting for now, we handle this separately and always assume a full allocation for local computation. This should result in fewer unnecessary "if range not null do this, else do that" branches.
* Memory pinning: To prevent excessive allocation ID lookups in the hot path, especially when using --vec, this change "pins" memory by allocation type of previous accesses. Simply put, as long as there is no different access type (e.g., a call to getValues() for host vs. device memory), it is assumed that the data has not changed and the meta data object does not need to be queried.
Closes daphne-eu#758
Due to the use of a pointer to a local variable, the distributed (GRPC_SYNC) mode crashed in test cases. This patch fixes this by using std::unique_ptr appropriately. Furthermore, a check for nullptr is performed before getting distributed data, so that a message indicates that execution failed at this point.
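Two of the commit messages above describe small behavioral fixes: fileExists() answering with a boolean instead of throwing, and distributed data being owned through std::unique_ptr with a nullptr check before access. The following is only a minimal sketch of both ideas. All names (fileExistsSketch, DistributedDataSketch, AllocationSketch) are hypothetical and not taken from the DAPHNE code base, and the use of std::filesystem is an assumption about how such a check could be written.

```cpp
#include <cassert>
#include <filesystem>
#include <memory>
#include <stdexcept>
#include <string>
#include <system_error>
#include <utility>

// Hypothetical helper: a purely boolean existence check.
// The error_code overload of std::filesystem::exists does not throw;
// any error is simply reported as "does not exist".
bool fileExistsSketch(const std::string &path) {
    std::error_code ec;
    return std::filesystem::exists(path, ec) && !ec;
}

// Hypothetical stand-in for the data a distributed worker holds.
struct DistributedDataSketch {
    std::string identifier;
};

// Hypothetical allocation descriptor that owns its distributed data.
// Ownership via std::unique_ptr avoids keeping a pointer to a local
// variable that has already gone out of scope.
class AllocationSketch {
    std::unique_ptr<DistributedDataSketch> data;

  public:
    AllocationSketch() = default;
    explicit AllocationSketch(std::unique_ptr<DistributedDataSketch> d) : data(std::move(d)) {
        assert(data != nullptr); // a descriptor built with data must actually hold it
    }

    // Check for nullptr first, so a failure yields a message naming the
    // place where execution failed instead of a crash.
    const DistributedDataSketch &getData() const {
        if (!data)
            throw std::runtime_error("AllocationSketch::getData: no distributed data set");
        return *data;
    }
};
```

In this sketch the caller of fileExistsSketch() decides whether a missing file is an error, which mirrors the intent stated in the commit message; the exception in getData() is one possible way to surface the nullptr case.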
corepointer force-pushed the mdo_csr_cuda_refactor branch from ce36921 to d434bf5 on October 19, 2024 00:59
Labels
* Accelerators
* Distributed: Issues and PRs related to distributed computation
* feature: missing/requested features
* performance: label for PRs of perf++ and issues of perf--
This PR moves the MetaDataObject (MDO) functionality out of DenseMatrix and generalizes it to be used by other classes derived from Structure as well.
Furthermore, this contains a performance improvement to prevent excessive allocation ID lookups and a separation of ranged and full allocations.
All tests are running except the distributed ones.
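To illustrate the memory pinning idea from the commit messages, here is a minimal sketch of how repeated accesses of the same allocation type could skip the meta data lookup. It is not the code of this PR: AllocType, MetaDataSketch, and PinnedMatrixSketch are hypothetical names, both "host" and "device" buffers are plain host vectors, and any synchronization between the two is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical allocation types; the real code base has its own enum.
enum class AllocType { Host, Device };

// Hypothetical stand-in for a meta data object that maps allocation
// types to buffers. The lookup counter makes the slow path observable.
class MetaDataSketch {
    std::map<AllocType, std::vector<double>> buffers;

  public:
    std::size_t lookups = 0;

    explicit MetaDataSketch(std::size_t n) {
        buffers[AllocType::Host] = std::vector<double>(n, 0.0);
        buffers[AllocType::Device] = std::vector<double>(n, 0.0); // pretend device memory
    }

    // Slow path: search for the allocation of the requested type.
    double *find(AllocType t) {
        ++lookups;
        return buffers.at(t).data();
    }
};

// Hypothetical matrix that remembers ("pins") the buffer of the last access type.
class PinnedMatrixSketch {
    MetaDataSketch mdo;
    AllocType pinnedType = AllocType::Host;
    double *pinned = nullptr; // nullptr means nothing is pinned yet

  public:
    explicit PinnedMatrixSketch(std::size_t n) : mdo(n) {}
    // Copying would leave the pinned pointer referring to another object's buffer.
    PinnedMatrixSketch(const PinnedMatrixSketch &) = delete;
    PinnedMatrixSketch &operator=(const PinnedMatrixSketch &) = delete;

    // Query the meta data object only when the access type changes.
    double *getValues(AllocType t) {
        if (pinned == nullptr || t != pinnedType) {
            pinned = mdo.find(t);
            pinnedType = t;
        }
        assert(pinned != nullptr);
        return pinned;
    }

    std::size_t lookupCount() const { return mdo.lookups; }
};
```

With this sketch, calling getValues(AllocType::Host) many times in a row performs a single lookup; only a switch to AllocType::Device triggers another one. The trade-off is the assumption stated in the commit message: as long as the access type stays the same, the data is treated as unchanged.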