Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bugfix]Revert PR #2420 cuda/nms changes #2747

Merged
merged 1 commit into from
Mar 9, 2019
Merged

Conversation

Laurawly
Copy link
Contributor

@Laurawly Laurawly commented Mar 7, 2019

According to discussion forum: https://discuss.tvm.ai/t/mali-cl-out-of-resources-error-for-mxnet-ssd-models/1863. Found the bug caused by PR #2420's cuda/nms.py change. Reverting it back. @apivovarov

@Laurawly Laurawly requested review from masahi and vinx13 March 7, 2019 22:49
@apivovarov
Copy link
Contributor

Can you add unit tests for that fix?

@Laurawly
Copy link
Contributor Author

Laurawly commented Mar 7, 2019

Can you add unit tests for that fix?

nms unit tests already exits. The bug only happens for mali rk3399 and I don't think they have it in CI (correct me if I'm wrong.)

@apivovarov
Copy link
Contributor

apivovarov commented Mar 8, 2019

Can you add unit tests for that fix?

nms unit tests already exits. The bug only happens for mali rk3399 and I don't think they have it in CI (correct me if I'm wrong.)

Can we add unit test for sort_ir method which fails without this fix and works fine with this fix?

@vinx13
Copy link
Member

vinx13 commented Mar 8, 2019

@Laurawly do you have idea why this caused error?

@masahi masahi self-assigned this Mar 8, 2019
@Laurawly
Copy link
Contributor Author

Laurawly commented Mar 8, 2019

Can you add unit tests for that fix?

nms unit tests already exits. The bug only happens for mali rk3399 and I don't think they have it in CI (correct me if I'm wrong.)

Can we add unit test for sort_ir method which fails without this fix and works fine with this fix?

Sort_ir is included in nms op. It's not yet separated as a single op. Again this case will only fail on certain mali GPUs.

@Laurawly
Copy link
Contributor Author

Laurawly commented Mar 8, 2019

@Laurawly do you have idea why this caused error?

@vinx13 I'm not quite sure. It seems to me that the less vthread we use, the more workload each vthread has which might cause cl_out_of_resource error in this case. Also I tried adding storage sync for previous method but it still failed.

@masahi masahi merged commit b0a0ae4 into apache:master Mar 9, 2019
wweic pushed a commit to neo-ai/tvm that referenced this pull request Mar 9, 2019
wweic pushed a commit to neo-ai/tvm that referenced this pull request Mar 12, 2019
wweic pushed a commit to neo-ai/tvm that referenced this pull request Mar 12, 2019
@Laurawly Laurawly deleted the dev3 branch March 22, 2019 20:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants