pull: freezes on large datasets #7414
Comments
EDIT: also getting hangs with no reported errors. Looking at the logs, when this happens one of the threads is erroring out:
Apparently 520 is a "catch-all" Cloudflare error (see this)
Reducing the dataset size should make it possible to run the benchmarks again while we wait for iterative/dvc#7414 to fix #319
This also happens when calling
Small update: by forcing a timeout value (60) on fsspec's sync wrapper in fsspec/asyn.py:

diff --git a/fsspec/asyn.py b/fsspec/asyn.py
index 18cbeb0..a013cf0 100644
--- a/fsspec/asyn.py
+++ b/fsspec/asyn.py
@@ -29,7 +29,7 @@ async def _runner(event, coro, result, timeout=None):
         event.set()
 
 
-def sync(loop, func, *args, timeout=None, **kwargs):
+def sync(loop, func, *args, timeout=60, **kwargs):
     """
     Make loop run coroutine until it returns. Runs in other thread
     """

one gets the following traceback when the hang happens:
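For readers unfamiliar with the pattern, here is a minimal sketch of why a default of `timeout=None` can turn a stalled coroutine into an indefinite hang, while a finite timeout surfaces it as an exception. This is not fsspec's implementation; `start_background_loop`, `run_sync`, `stalled`, and `quick` are hypothetical names, and the sketch uses only the standard library.

```python
import asyncio
import concurrent.futures
import threading


def start_background_loop():
    """Start an event loop in a daemon thread and return the loop."""
    loop = asyncio.new_event_loop()
    threading.Thread(target=loop.run_forever, daemon=True).start()
    return loop


def run_sync(loop, coro_func, *args, timeout=None, **kwargs):
    """Run a coroutine on `loop` from the calling thread and wait for it.

    With timeout=None, a coroutine that never finishes blocks the caller
    forever. With a finite timeout, the caller gets TimeoutError instead.
    """
    future = asyncio.run_coroutine_threadsafe(coro_func(*args, **kwargs), loop)
    try:
        return future.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        # Ask the loop to cancel the stalled task before reporting the failure.
        future.cancel()
        raise TimeoutError(f"no result after {timeout} seconds") from None


async def stalled():
    """Stand-in for a request that never completes (hypothetical)."""
    await asyncio.sleep(3600)


async def quick(value):
    """Stand-in for a request that completes immediately (hypothetical)."""
    return value * 2
```

Calling `run_sync(loop, stalled, timeout=0.1)` raises `TimeoutError`, whereas `run_sync(loop, stalled)` would block until the coroutine finishes, which is the behaviour that looks like a freeze from the command line.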
We have been getting reports that the timeout on sock_read was raising timeout errors even for chunked uploads, and sometimes even when uploading zero-byte files. See https://github.com/iterative/dvc/issues/8065 and iterative/dvc#8100. This kind of logic doesn't belong here and should be upstreamed (e.g. RetryClient/ClientTimeout). We added the timeout in iterative/dvc#7460 because of the freezes in iterative/dvc#7414. I think we can roll this back for now, given that there are many reports of failures with this line; if we get any new reports of hangs, we'll investigate them separately.
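To make the idea of a per-read timeout concrete, the following is a rough standard-library sketch in which each individual read is bounded rather than the whole transfer. It does not use aiohttp and does not reproduce the failures reported above; `read_all` and `make_source` are hypothetical helpers invented for illustration.

```python
import asyncio


async def read_all(next_chunk, idle_timeout):
    """Collect chunks until next_chunk() returns b"".

    Each read is bounded by idle_timeout on its own, so a slow but steady
    transfer succeeds while a single stalled read raises asyncio.TimeoutError.
    """
    chunks = []
    while True:
        chunk = await asyncio.wait_for(next_chunk(), timeout=idle_timeout)
        if not chunk:
            return b"".join(chunks)
        chunks.append(chunk)


def make_source(chunks, delay):
    """Build a fake reader that waits `delay` seconds before every chunk."""
    remaining = iter(chunks)

    async def next_chunk():
        await asyncio.sleep(delay)
        # Return b"" once the chunks are exhausted to signal end of stream.
        return next(remaining, b"")

    return next_chunk
```

With `make_source([b"a", b"b"], 0.01)` and `idle_timeout=1` the call returns `b"ab"`; with a delay longer than `idle_timeout` the first read already fails, which shows how sensitive such a setting is to the pause between data portions.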
simplify _prepare credentials, revert some of the changes related to iterative/dvc#7414
When running dvc pull on large datasets, dvc sometimes hangs and never returns. Here's an example of a traceback (after interrupting twice with KeyboardInterrupt):
$ dvc doctor
To reproduce:
See also iterative/dvc-bench#319