Make async_pool immune to handle reuse. #5348
This file is almost copied from cv-cuda.
@@ -549,7 +577,7 @@ class async_pool_resource : public async_memory_resource<Kind>,
   detail::pooled_map<char *, padded_block, true> padded_;

-  std::unordered_map<cudaStream_t, PerStreamFreeBlocks> stream_free_;
+  std::unordered_map<uint64_t, PerStreamFreeBlocks> stream_free_;
The per-stream resources are keyed not with a handle, but with an ID hint.
CI MESSAGE: [13203053]: BUILD STARTED
CI MESSAGE: [13203053]: BUILD PASSED
"cuGetProcAddress": {},
"cuGetProcAddress_v2": {},
Depending on the CUDA header version, we need either the former or the latter (the latter starting with CUDA 12.0).
dali/core/mm/stream_id_hint.cc
Outdated
  return fn;
}

bool _hasPreciseHint() {
nit: why the _ at the beginning?
Good question. I don't know :)
Signed-off-by: Michal Zientkiewicz <[email protected]>
9c08600 to 8a4fab1
CI MESSAGE: [13281756]: BUILD STARTED
CI MESSAGE: [13281883]: BUILD STARTED
Remove underscore from a function name.
Signed-off-by: Michal Zientkiewicz <[email protected]>
c9828d8 to ddaf0a7
CI MESSAGE: [13281980]: BUILD STARTED
CI MESSAGE: [13281980]: BUILD PASSED
Category:
Bug fix. Strictly speaking, the previous behavior was accepted and documented, but it was nonetheless limiting and error-prone.
Description:
Prior to this change, stream-ordered allocations had to use stream handles with care: specifically, it was illegal to destroy a stream that still had a pending deallocation. This could cause problems with streams over which we have no control.
This PR removes this limitation. Also, the handling of the per-thread default stream was broken.
Instead of using a stream handle, this PR tries to obtain a unique stream ID. If one is available, we proceed as before, but with extra guarantees: we don't have to care about the stream being destroyed, because the ID is unique.
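To illustrate why a unique ID helps, here is a minimal sketch in plain C++ with no CUDA dependency. Everything in it (`FakeDriver`, `FakeHandle`, `run_reuse_demo`) is hypothetical and is not DALI or CUDA code. It assumes a driver that may hand a destroyed stream's handle value to the next stream it creates, while the stream ID is never repeated.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for cudaStream_t; only the handle *value* matters here.
using FakeHandle = std::uintptr_t;
using FreeBlocks = std::vector<int>;  // cached block numbers

// Hypothetical driver model: handle values are recycled, IDs are never repeated.
struct FakeDriver {
  std::uint64_t next_id = 1;
  FakeHandle next_handle = 0x1000;
  std::vector<FakeHandle> recycled;
  std::unordered_map<FakeHandle, std::uint64_t> live;  // handle -> unique ID

  FakeHandle create_stream() {
    FakeHandle h;
    if (!recycled.empty()) {
      h = recycled.back();  // reuse the handle value of a destroyed stream
      recycled.pop_back();
    } else {
      h = next_handle;
      next_handle += 0x10;
    }
    live[h] = next_id++;
    return h;
  }

  void destroy_stream(FakeHandle h) {
    live.erase(h);
    recycled.push_back(h);
  }

  std::uint64_t stream_id(FakeHandle h) const { return live.at(h); }
};

struct ReuseDemo {
  bool handle_keys_collide;
  bool id_keys_collide;
};

inline ReuseDemo run_reuse_demo() {
  FakeDriver drv;
  std::unordered_map<FakeHandle, FreeBlocks> by_handle;
  std::unordered_map<std::uint64_t, FreeBlocks> by_id;

  FakeHandle a = drv.create_stream();
  by_handle[a].push_back(42);              // block 42 freed on stream a
  by_id[drv.stream_id(a)].push_back(42);
  drv.destroy_stream(a);                   // the deallocation may still be pending

  FakeHandle b = drv.create_stream();      // gets the same handle value as a
  bool handle_hit = by_handle.count(b) > 0;            // b finds a's block
  bool id_hit = by_id.count(drv.stream_id(b)) > 0;     // b's ID is new
  return {handle_hit, id_hit};
}
```

In this model the handle-keyed map hands stream `b` a block that was freed on the already destroyed stream `a`, while the ID-keyed map keeps the two streams apart.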
When a stream ID is not available (old drivers), this code uses a handle-derived, non-unique ID and scans the per-stream free blocks linearly when returning memory to upstream. When getting a per-stream free block, we cannot be sure that it really comes from the same stream. Therefore, if the block isn't ready yet, we record an event on the requesting stream, so that, in the worst case, the stream waits for the allocation to really happen.
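The fallback decision can be sketched as follows. This is one possible reading of the description above, written in plain C++ without CUDA. `FreeBlock`, `Acquired`, `Cache` and `acquire` are hypothetical names, and the `release_done` flag stands in for whatever readiness check the real code performs on a block.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <unordered_map>

// Hypothetical model; none of these names come from DALI.
struct FreeBlock {
  int id;
  bool release_done;  // stands in for "the block's pending work has completed"
};

struct Acquired {
  bool found;             // false if nothing was cached under this hint
  int id;                 // valid only when found is true
  bool stream_must_wait;  // true if the requester has to synchronize before use
};

using Cache = std::unordered_map<std::uint64_t, std::deque<FreeBlock>>;

// Takes the oldest cached block stored under `hint`, if any.
// `hint_is_precise` is false when the driver cannot report a unique stream ID.
inline Acquired acquire(Cache &cache, std::uint64_t hint, bool hint_is_precise) {
  auto it = cache.find(hint);
  if (it == cache.end() || it->second.empty())
    return {false, -1, false};

  FreeBlock blk = it->second.front();
  it->second.pop_front();

  // A precise hint identifies exactly one stream, so stream order already
  // protects the block. An imprecise hint may alias another stream, so a
  // block that is not ready yet requires the requesting stream to wait.
  bool wait = !hint_is_precise && !blk.release_done;
  return {true, blk.id, wait};
}
```

With an imprecise hint, only blocks that are not yet ready incur the extra synchronization; blocks that are already ready, and all blocks obtained through a precise hint, are returned without it.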
Additional information:
The functionality borrows heavily from the implementation of stream ID hints in cv-cuda.
Affected modules and functionalities:
Memory resources.
Key points relevant for the review:
N/A
Tests:
NOTE: The fallback behavior cannot be triggered manually. We could add a separate test target with environment variables that are used only for testing, but I don't think it's worth the extra overhead. The fallback triggers automatically on old (pre-CUDA 12) drivers.
Checklist
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: N/A