Try to understand and fix test_shutdown_wait_last_export #4042
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description
TL;DR The time assert in the test was wrong.
Looking at the original commit (c84ba9) of this test, the delay of the mocked backoff was 4 seconds fixed with 6 iterations. So the time until shutdown returned was around 24 seconds. The default timeout of the lock wait in shutdown did not hit before the retry of the exporter was over.
The line
self.assertGreaterEqual(now, (start_time + 30 / 1000))
seemed to be wrong from the beginning. It just ensured that the shutdown call took at least 30ms. I assume it was supposed to do something else. But to me it's not clear what exactly it should do.One explanation would be that it was intended to do
self.assertGreaterEqual(now, (start_time + 30 * 1000))
instead. But that would mean: Shutdown should need at least 30 seconds to pass, which does not make sense imho.Another option would be that it was intended to do
self.assertLessEqual(now, (start_time + 30 * 1000))
instead. That would mean: Shutdown should need at maximum 30 seconds to pass, which actually could make more sense. It would ensure that the call ofself.exporter.shutdown
returns after 30 seconds.My biggest issue here is, that it tests the happy path where the retry is over before the timeout while there is no mechanism to ensure that the exporter thread is stopped in a sensible time in case it's current exponential backoff is in a longer sleep already while the shutdown flag is set. If I understand it correctly, it can hit the beginning of a 32 second sleep, which could mean that a process needs a) 30 seconds to signal the shutdown and additional 32 seconds to finish the exporter thread. But that's something for another change.
After all these words, looking at the current HEAD, the situation of the test case is basically the same as originally. The retry interval has been changed recently to be way shorter. The default lock timeout stayed the same at 30 seconds, but that does not matter too much since the retry mechanism is finished long before that. And the aforementioned assert is still wrong, with some additional problem from #4014.
Fixes the issue mentioned here: #4014 (comment)
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration
Does This PR Require a Contrib Repo Change?
Checklist: