AssertionError in Scheduler.restart: assert not self.tasks #7398
    await c.close()
    del c, first_batch

>   await async_wait_for(lambda: len(s.tasks) == len(second_batch), 5)

distributed/tests/test_scheduler.py:413:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
predicate = <function test_queued_release_multiple_workers.<locals>.<lambda> at 0x7f9a50098940>, timeout = 5, fail_func = None, period = 0.05

    async def async_wait_for(predicate, timeout, fail_func=None, period=0.05):
        deadline = time() + timeout
        while not predicate():
            await asyncio.sleep(period)
            if time() > deadline:
                if fail_func is not None:
                    fail_func()
>               pytest.fail(f"condition not reached until {timeout} seconds")
E               Failed: condition not reached until 5 seconds

This failure, in which tasks are not removed from the scheduler after their client disconnects, seems like the same thing as this issue. That makes me confident that #7402 would fix it, and that the test I added is a reproducer.
https://github.com/coiled/coiled-runtime/actions/runs/3686850652/jobs/6239652435
What happened:
Scheduler.client_releases_keys failed to transition all tasks to forgotten; Scheduler.restart consequently failed on line 5730:
distributed/distributed/scheduler.py, lines 5722 to 5730 in 7fb9c48
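
The failure chain described above can be pictured with a toy model. This is only a sketch and not the code in distributed/scheduler.py: ToyScheduler, its wants mapping, and the skip parameter are hypothetical names made up for illustration, and skip merely simulates a release that leaves a task behind.

```python
from collections import defaultdict


class ToyScheduler:
    """Hypothetical stand-in for illustration; NOT distributed.Scheduler."""

    def __init__(self):
        self.tasks = {}                 # key -> state string
        self.wants = defaultdict(set)   # key -> clients that still want it

    def submit(self, client, key):
        self.tasks[key] = "memory"
        self.wants[key].add(client)

    def client_releases_keys(self, client, keys, skip=()):
        """Drop the client's interest and forget tasks nobody wants.

        `skip` is an assumption of this sketch: keys listed there are never
        forgotten, which mimics the behaviour reported in this issue.
        """
        for key in keys:
            self.wants[key].discard(client)
            if not self.wants[key] and key not in skip:
                del self.tasks[key]
                del self.wants[key]

    def restart(self):
        # Release everything every known client still wants ...
        clients = {c for cs in self.wants.values() for c in cs}
        for client in clients:
            keys = [k for k, cs in self.wants.items() if client in cs]
            self.client_releases_keys(client, keys)
        # ... then expect an empty task table, as in the assertion in the title.
        assert not self.tasks, f"tasks left behind: {sorted(self.tasks)}"


# Healthy path: the release forgets the task, so restart succeeds.
healthy = ToyScheduler()
healthy.submit("client-1", "x")
healthy.restart()

# Simulated bug: the release leaves "x" behind, so restart trips the assert.
buggy = ToyScheduler()
buggy.submit("client-1", "x")
buggy.client_releases_keys("client-1", ["x"], skip={"x"})
message = None
try:
    buggy.restart()
except AssertionError as exc:
    message = str(exc)
```

In the simulated-bug case no client wants the key any more, so restart has nothing left to release and the leftover task surfaces only at the final assertion, which matches the shape of the error reported here.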
There is no cluster dump of test_tensordot_stress.
Cluster dump of test_spill (taken between steps 1 and 2 above):
s3://coiled-runtime-ci/test-scratch/cluster_dumps/spill-9ad15a09/stability.test_spill.py.test_spilling[False].msgpack.gz
CC @fjetter