GPU MultiPaste #2681
Signed-off-by: Piotr Kowalewski <[email protected]>
…d few TODO items Signed-off-by: Piotr Kowalewski <[email protected]>
… output_size. Signed-off-by: Piotr Kowalewski <[email protected]>
…tes. Signed-off-by: Piotr Kowalewski <[email protected]>
…emoved intersection checking bug and memory alloc bug. Signed-off-by: Piotr Kowalewski <[email protected]>
The GPU operator seems to be working now and passes the tests most of the time. Sometimes though, with roughly a 20% chance, a bug produces random non-filled regions in the output.

To narrow it down, I made non-pasted regions gray instead of black. The output then has a gray background but still contains black glitched regions.

Finally, I tried removing the image copy completely and instead assigning a gray shade that depends on the grid cell. This should never write 0 to any pixel, and the values are not read through any pointer. Yet the glitched black regions remained.

Also, the operation did not crash, so each thread should have gone through the whole for loop, yet it seems some of them sometimes do not.
const InListGPU<InputType, ndims> &in,
span<paste::MultiPasteSampleInput<ndims - 1>> samples
) {
assert(ndims == 3);
How does it work with rectangles that are out of output bounds?
For example:
output is 640x480
input starts at 600, 400 and is specified as 100x100.
We should either clip (preferable) or throw, but I don't see it happening here.
If we clip, then we should completely reject boxes that are totally outside the output area.
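The clipping behaviour asked for here could look roughly like the sketch below. It is an illustration in plain Python, not the operator's code: the function name `clip_paste` and the (y, x) ordering of anchors and sizes are assumptions made for the example.

```python
def clip_paste(out_anchor, in_anchor, shape, out_size):
    """Clip one paste region to the output bounds.

    All arguments are per-dimension tuples, e.g. (y, x). This layout is an
    assumption for the sketch, not necessarily the one used by the operator.
    Returns the clipped (out_anchor, in_anchor, shape), or None when the
    region lies entirely outside the output and should be rejected.
    """
    new_out, new_in, new_shape = [], [], []
    for o, i, s, limit in zip(out_anchor, in_anchor, shape, out_size):
        lo = max(o, 0)           # first output coordinate that is in bounds
        hi = min(o + s, limit)   # one past the last in-bounds coordinate
        if hi <= lo:
            return None          # nothing of the region is visible
        new_out.append(lo)
        new_in.append(i + (lo - o))  # skip the input part that was cut off
        new_shape.append(hi - lo)
    return tuple(new_out), tuple(new_in), tuple(new_shape)
```

With the numbers from the comment (a 640x480 output, a 100x100 region anchored at x=600, y=400), the clipped region would be 40 pixels wide and 80 pixels high.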
For now, I added throwing exceptions from the operator if some box is outside in/out bounds.
@@ -232,7 +237,8 @@ def test_operator_multipaste():
 [4, 2, (128, 128), (128, 128), False, False, False, False, False, types.UINT8],
 [4, 2, (128, 128), (128, 128), False, False, False, False, False, types.INT16],
 [4, 2, (128, 128), (128, 128), False, False, False, False, False, types.INT32],
-[4, 2, (128, 128), (128, 128), False, False, False, False, False, types.FLOAT]
+[4, 2, (128, 128), (128, 128), False, False, False, False, False, types.FLOAT],
Do we have tests with out-of-bounds/clipping?
Added those tests now.
if use_gpu:
    kwargs["device"] = "gpu"

pasted = fn.multi_paste(resized.gpu() if use_gpu else resized, **kwargs)
-if use_gpu:
-    kwargs["device"] = "gpu"
-pasted = fn.multi_paste(resized.gpu() if use_gpu else resized, **kwargs)
+pasted = fn.multi_paste(resized.gpu() if use_gpu else resized)
I will remove kwargs["device"], but there are other kwargs that still have to be passed.
show_images(bs, r)
manual_verify(bs, input, r, in_idx_l, in_anchors_l, shapes_l, out_anchors_l, [out_size + (3,)] * bs, out_dtype)

try:
Nose has assert_raises for this kind of test. Maybe it would be good to split it.
I checked the assert_raises method, but it only asserts the type (RuntimeError) of the exception, and any DALI_FAIL results in this type.
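One way to check the message as well as the type is the standard library's assertRaisesRegex, sketched below. The helper name `check_raises_with_message`, the stand-in function `out_of_bounds_paste`, and the error text are hypothetical and only illustrate the idea. They are not DALI's actual test utilities or messages.

```python
import unittest


def check_raises_with_message(func, exc_type, pattern):
    """Fail unless func() raises exc_type with a message matching pattern."""
    case = unittest.TestCase()
    with case.assertRaisesRegex(exc_type, pattern):
        func()


def out_of_bounds_paste():
    # Stand-in for building and running a pipeline whose paste region falls
    # outside the output. The message text is made up for this example.
    raise RuntimeError("Paste region is out of output bounds")


# Passes: both the type and the message match.
check_raises_with_message(out_of_bounds_paste, RuntimeError, "out of output bounds")
```

A RuntimeError with a different message would make the helper raise an AssertionError, which is the distinction a type-only check cannot make.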
!build
CI MESSAGE: [2150773]: BUILD STARTED
CI MESSAGE: [2150773]: BUILD FAILED
Clang complains about:
@TheTimemaster - can you check it?
You can try
!build
CI MESSAGE: [2157524]: BUILD STARTED
CI MESSAGE: [2157524]: BUILD PASSED
Why we need this PR?
What happened in this PR?
What solution was applied:
The output image is divided into a non-uniform grid in which each cell contains data from exactly one paste: the last one that covered that area. The grid is computed in the operator. The kernel then only copies data, but before each pixel it has to check whether it should advance to the next grid cell vertically or horizontally.
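The grid idea can be sketched on the CPU as follows. This is a simplified illustration, not the code from the PR: the names `build_grid` and `sweep`, the (y, x, h, w) rectangle layout, and the choice to write the paste index instead of copying pixel data are assumptions made to keep the example short. All rectangles are assumed to lie within the output.

```python
def build_grid(out_size, pastes):
    """Split the output into cells so that each cell has one source paste.

    out_size: (H, W). pastes: list of (y, x, h, w) rectangles in output
    coordinates, in paste order, assumed to lie within the output.
    Returns (ys, xs, cells), where cells[i][j] is the index of the last paste
    covering rows ys[i]..ys[i+1]-1 and columns xs[j]..xs[j+1]-1, or -1.
    """
    H, W = out_size
    # Every rectangle edge becomes a grid line.
    ys = sorted({0, H} | {c for y, x, h, w in pastes for c in (y, y + h)})
    xs = sorted({0, W} | {c for y, x, h, w in pastes for c in (x, x + w)})
    cells = [[-1] * (len(xs) - 1) for _ in range(len(ys) - 1)]
    for idx, (y, x, h, w) in enumerate(pastes):  # later pastes overwrite
        for i in range(len(ys) - 1):
            if y <= ys[i] and ys[i + 1] <= y + h:
                for j in range(len(xs) - 1):
                    if x <= xs[j] and xs[j + 1] <= x + w:
                        cells[i][j] = idx
    return ys, xs, cells


def sweep(out_size, ys, xs, cells):
    """Fill the output by walking pixels and stepping through grid cells."""
    H, W = out_size
    out = [[-1] * W for _ in range(H)]
    i = 0
    for y in range(H):
        while y >= ys[i + 1]:      # jump to the next cell vertically
            i += 1
        j = 0
        for x in range(W):
            while x >= xs[j + 1]:  # jump to the next cell horizontally
                j += 1
            out[y][x] = cells[i][j]
    return out


pastes = [(0, 0, 3, 4), (1, 2, 3, 4)]  # the second paste overlaps the first
ys, xs, cells = build_grid((4, 6), pastes)
result = sweep((4, 6), ys, xs, cells)
```

In the overlapping area the second paste wins, so `result[1][2]` is 1 while `result[1][1]` is still 0. A real kernel would copy the corresponding input pixel instead of storing the paste index.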
Affected modules and functionalities:
GPU Implementation for Multipaste operator and kernel.
Key points relevant for the review:
MultiPaste GPU kernel + multipaste.cu operator
Validation and testing:
The grid sweeping algorithm was first implemented and tested outside DALI: https://pastebin.com/0QSnu3Bw. Automatic tests have now been added that run the same cases on the GPU.
Documentation (including examples):
No new documentation added
JIRA TASK: NA