Rework displacement filter to sample-based approach #3311
* Get rid of unnecessary DataDependantSetup
* Introduce SetupImpl
* Change pass-by-pointer to pass-by-reference
* Rework GPU Op to process flattened blocks instead of whole images accessed via offset to one global pointer.
* Instead of accessing underlying contiguous TL buffer we access individual samples in GPU Op.
* Rework CPU Op to use HostWorkspace
* Fix masking in CPU Op - access it as int instead of bool to be consistent with the default and GPU Op.
* Fix masking in CPU Op - copy does not try to access stream in CPU workspace which causes error.
* Add paths in water test to cover more code paths
  * prime-sized image to fall into the non-optimized kernel
  * mask support
  * both input types: uint8 and float32.

Signed-off-by: Krzysztof Lecki <[email protected]>
!build
CI MESSAGE: [2908674]: BUILD STARTED
CI MESSAGE: [2908674]: BUILD FAILED
Signed-off-by: Krzysztof Lecki <[email protected]>
!build
CI MESSAGE: [2923073]: BUILD STARTED
Signed-off-by: Krzysztof Lecki <[email protected]>
reinterpret_cast<const typename Displacement::Param *>(raw_params);
displace.param = params[n];
__device__ __host__ inline void operator()(Displacement &displace, const void *raw_params) {
const auto *const params = reinterpret_cast<const typename Displacement::Param *>(raw_params);
Cast from void* is a static_cast.
- const auto *const params = reinterpret_cast<const typename Displacement::Param *>(raw_params);
+ const auto *const params = static_cast<const typename Displacement::Param *>(raw_params);
done
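To illustrate the point of this thread: a conversion from `void*` back to an object pointer needs only `static_cast`, which also makes clear that no bit-level reinterpretation is intended. The sketch below is a minimal illustration; `Param` and `as_params` are hypothetical stand-ins, not the actual `Displacement::Param` type or any DALI function.

```cpp
#include <cassert>

// Hypothetical stand-in for Displacement::Param (assumption, not the DALI type).
struct Param {
  float amplitude;
  int period;
};

// Recover a typed pointer from a type-erased one.
// static_cast is sufficient for const void* -> const T*.
inline const Param *as_params(const void *raw_params) {
  return static_cast<const Param *>(raw_params);
}
```

The caller remains responsible for passing a pointer that really refers to `Param` objects; `static_cast` does not check that.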
const int H = sample.shape[0];
const int W = sample.shape[1];
const int C = sample.shape[2];
These could be fast_div in sample.shape....
I won't be trying to benchmark that.
const int c = out_idx % C;
const int w = (out_idx / C) % W;
const int h = (out_idx / W / C);
- const int c = out_idx % C;
- const int w = (out_idx / C) % W;
- const int h = (out_idx / W / C);
+ int64_t idx = out_idx;
+ const int c = idx % C;
+ idx /= C;
+ const int w = idx % W;
+ idx /= W;
+ const int h = idx;
or at least
- const int c = out_idx % C;
- const int w = (out_idx / C) % W;
- const int h = (out_idx / W / C);
+ const int c = out_idx % C;
+ const int w = (out_idx / C) % W;
+ const int h = (out_idx / C / W);
The way it was written, it prevented optimization of the last two divisions.
done
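The first suggestion in this thread can be sketched as a standalone helper. Each quotient feeds the next step, so the work per dimension is one modulo and one division, and the result of `out_idx / C` is reused rather than recomputed. `HwcCoord`, `unflatten_hwc` and `flatten_hwc` are hypothetical names introduced for illustration; they are not part of the PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: coordinates of one element in an HWC image.
struct HwcCoord {
  int h, w, c;
};

// Split a flat offset into (h, w, c), peeling off the innermost
// dimension first and carrying the quotient into the next step.
inline HwcCoord unflatten_hwc(int64_t out_idx, int W, int C) {
  int64_t idx = out_idx;
  const int c = static_cast<int>(idx % C);
  idx /= C;
  const int w = static_cast<int>(idx % W);
  idx /= W;
  const int h = static_cast<int>(idx);
  return {h, w, c};
}

// Inverse mapping, used here only to check the decomposition.
inline int64_t flatten_hwc(HwcCoord p, int W, int C) {
  return (static_cast<int64_t>(p.h) * W + p.w) * C + p.c;
}
```

For non-negative indices, `out_idx / W / C` and `out_idx / C / W` give the same `h`; the difference the reviewer points out is only in which intermediate quotient the compiler can share with the computation of `w`.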
Nitpicks, mostly.
CI MESSAGE: [2923073]: BUILD FAILED
flat_block_setup_(32),
channel_block_setup_(32) {
Why not just put it in the member definition?
No particular reason, but due to the intricacies of C++, I tried it and could not use flat_block_setup_ = {32}; only FlatBlockSetup flat_block_setup_ = FlatBlockSetup(32); worked. Not sure which one is nicer.
Hmm, I can use flat_block_setup_{32}.
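One plausible explanation for what was observed above, assuming the setup type has an `explicit` single-argument constructor (the thread does not confirm this): `= {32}` is copy-list-initialization, which may not select an explicit constructor, while `{32}` is direct-list-initialization and may. `FakeBlockSetup` and `FakeFilter` below are hypothetical stand-ins, not the DALI classes.

```cpp
#include <cassert>

// Hypothetical stand-in for the block setup type.
// ASSUMPTION: the constructor is explicit; the PR thread does not say so.
class FakeBlockSetup {
 public:
  explicit FakeBlockSetup(int block_size) : block_size_(block_size) {}
  int block_size() const { return block_size_; }

 private:
  int block_size_;
};

class FakeFilter {
 public:
  int flat_block_size() const { return flat_block_setup_.block_size(); }
  int channel_block_size() const { return channel_block_setup_.block_size(); }

 private:
  // FakeBlockSetup rejected_ = {32};  // ill-formed: copy-list-init cannot use an explicit constructor
  FakeBlockSetup flat_block_setup_{32};                      // direct-list-initialization
  FakeBlockSetup channel_block_setup_ = FakeBlockSetup(32);  // explicit temporary, also valid
};
```

Both accepted forms put the default next to the member declaration, so the constructor's initializer list no longer needs to mention these members.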
@@ -208,7 +219,10 @@ class DisplacementFilter<GPUBackend, Displacement,
 explicit DisplacementFilter(const OpSpec &spec) :
 Operator(spec),
 displace_(spec),
- interp_type_(spec.GetArgument<DALIInterpType>("interp_type")) {
+ interp_type_(spec.GetArgument<DALIInterpType>("interp_type")),
Order of initialization is wrong - and this could be an assignment inside the constructor body - the type of the variable is trivial whereas the initialization expression is not.
Fixed the order.
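As background for this thread: C++ initializes non-static data members in the order they are declared in the class, regardless of the order written in the constructor's initializer list. The sketch below records the actual order; `FakeOp`, `traced` and `init_log` are hypothetical names, and the pragma only silences the compiler's reorder warning for this deliberately mis-ordered demonstration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order in which member initializers actually run.
inline std::vector<std::string> &init_log() {
  static std::vector<std::string> log;
  return log;
}

// Logs the member name, then passes the value through.
inline int traced(const char *name, int value) {
  init_log().push_back(name);
  return value;
}

#if defined(__GNUC__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wreorder"
#endif
// Hypothetical class, not the DALI operator.
struct FakeOp {
  int displace_;     // declared first, so initialized first
  int interp_type_;  // declared second, so initialized second

  // The list is written in the opposite order on purpose;
  // the declaration order above still decides what runs first.
  FakeOp() : interp_type_(traced("interp_type_", 2)), displace_(traced("displace_", 1)) {}
};
#if defined(__GNUC__)
#pragma GCC diagnostic pop
#endif
```

Keeping the initializer list in declaration order avoids the warning and avoids surprises when one member's initializer reads another member.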
Signed-off-by: Krzysztof Lecki <[email protected]>
Signed-off-by: Krzysztof Lecki <[email protected]>
!build
CI MESSAGE: [2937973]: BUILD STARTED
CI MESSAGE: [2937973]: BUILD PASSED
Description
What happened in this PR
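Refactoring:
* Get rid of unnecessary DataDependantSetup
* Introduce SetupImpl
* Change pass-by-pointer to pass-by-reference
* Rework GPU Op to process flattened blocks instead of whole images accessed via offset to one global batch pointer.
* Instead of accessing underlying contiguous TL buffer we access individual samples in GPU Op.
* Rework CPU Op to use HostWorkspace [...] samples.
Fixes:
* Fix masking in CPU Op - access it as int instead of bool to be consistent with the default and GPU Op.
* Fix masking in CPU Op - copy does not try to access stream in CPU workspace which causes error.
Tests:
* Add paths in water test to cover more code paths
  * prime-sized image to fall into the non-optimized kernel
  * mask support
  * both input types: uint8 and float32.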
Signed-off-by: Krzysztof Lecki [email protected]
Additional information
Affected modules and functionalities:
Old displacement filter implementation, basis for Water, Sphere and Jitter.
Key points relevant for the review:
Checklist
Tests
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: N/A