[TEST][FLAKY] tests/python/frontend/tensorflow/test_forward.py::test_forward_combined_nms #8140
Comments
The same error happened at https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-8126/8/pipeline @trevor-m This is |
I was informed that there was another failure at https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-8179/6/pipeline/ which does use the new code path. I was able to get it to fail after 970 trials. |
Thanks for letting me know, I guess we can look into the NMS code which is shared by both paths? |
I'm not sure whether the two instances of flakiness have the same cause. If they do, then yes, we need to look at the core NMS loop. Are you sure the conversion logic in TF frontend for |
The reproduction I got locally was due to a tie in scores. The top two boxes have identical scores, but their order is swapped between TF and TVM:
@trevor-m I wouldn't call it a bug, since neither TF nor ONNX specifies what the order should be when there is a tie. In particular, TVM uses a stable sort, while TF seems to use an unstable sort for NMS. I confirmed this by comparing input and output box coordinates. We should probably change the test code to make sure there are no ties in scores. |
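To illustrate the tie issue described above, here is a minimal NumPy sketch. It is not the TVM or TF NMS code; it only shows that two equally valid descending sorts can return tied entries in a different order, which is enough to break an element-wise comparison of the selected boxes. The score values are made up for the example.

```python
import numpy as np

# Hypothetical scores: boxes 0 and 2 are tied for the top score.
scores = np.array([0.9, 0.5, 0.9, 0.1], dtype=np.float32)

# Descending order via a stable sort on the negated scores:
# tied entries keep their original relative order (0 before 2).
order_a = np.argsort(-scores, kind="stable")

# Another valid descending order: sort ascending, then reverse.
# Reversing flips the relative order of the tied entries (2 before 0).
order_b = np.argsort(scores, kind="stable")[::-1]

# Both orderings yield the same sorted scores...
same_scores = np.array_equal(scores[order_a], scores[order_b])
# ...but the box indices differ, so comparing the boxes themselves fails.
same_indices = np.array_equal(order_a, order_b)
```

Here `same_scores` is true while `same_indices` is false, which matches the symptom of the top two boxes being swapped.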
normally we should construct test cases to ensure there are no ties |
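One possible way to build tie-free inputs is sketched below. The helper name `make_tie_free_scores` and its parameters are hypothetical and not part of the existing test. The idea is to draw the scores from an evenly spaced grid and shuffle them, so every value is distinct by construction instead of relying on random floats not colliding.

```python
import numpy as np


def make_tie_free_scores(shape, seed=0, low=0.05, high=0.95):
    """Return a float32 array of the given shape with no repeated values.

    Hypothetical helper for illustration: values come from an evenly
    spaced grid in [low, high] and are shuffled, so ties cannot occur.
    """
    rng = np.random.default_rng(seed)
    n = int(np.prod(shape))
    grid = np.linspace(low, high, n, dtype=np.float32)
    # Guard against float32 rounding collapsing neighbours for very large n.
    assert np.unique(grid).size == n, "grid too fine for float32"
    return rng.permutation(grid).reshape(shape)


# Example: scores for 1 batch, 64 boxes, 20 classes (shapes are assumptions).
scores = make_tie_free_scores((1, 64, 20))
```

With distinct scores, the result no longer depends on how either framework breaks ties, so the comparison between TF and TVM outputs becomes deterministic for this input.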
I'm seeing flakiness in this test in about 1/3 of CI jobs, and it's becoming a real problem for getting other PRs merged. Should we think about disabling this test until we can resolve the flakiness? |
I'm fine with disabling the test for now, sorry I haven't had a chance to look into the flakiness yet. |
Another failure from https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-8358/2/pipeline |
@masahi please follow up to see if we can close this issue |
This can be closed in the sense that the flaky test is now disabled. But the underlying problem with the combined NMS converter for |
That test failed on a whitespace change, which looks suspicious... https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-8124/1/pipeline/