Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CachingFileSystem with http vomits on __del__ under fsspec 2022.10.0 #1078

Closed
jwodder opened this issue Oct 20, 2022 · 1 comment · Fixed by #1085
Closed

CachingFileSystem with http vomits on __del__ under fsspec 2022.10.0 #1078

jwodder opened this issue Oct 20, 2022 · 1 comment · Fixed by #1085

Comments

@jwodder
Copy link
Contributor

jwodder commented Oct 20, 2022

As of the release of fsspec 2022.10.0 yesterday, the following code:

import fsspec
from fsspec.implementations.cached import CachingFileSystem

fs = CachingFileSystem(
    fs=fsspec.filesystem("http"),
    cache_storage="cache01",
)

with fs.open("https://httpbin.org/uuid", "r", encoding="utf-8") as f:
    print(f.read())

outputs the following traceback on termination:

Traceback (most recent call last):
  File "/Users/jwodder/work/dev/tmp/fsspec/fsspec-cache01a.py", line 10, in <module>
    with fs.open("https://httpbin.org/uuid", "r", encoding="utf-8") as f:
  File "/Users/jwodder/.local/virtualenvwrapper/venvs/tmp-aaf3a4de95a14b1/lib/python3.10/site-packages/fsspec/implementations/cached.py", line 344, in <lambda>
    f.close = lambda: self.close_and_update(f, close)
  File "/Users/jwodder/.local/virtualenvwrapper/venvs/tmp-aaf3a4de95a14b1/lib/python3.10/site-packages/fsspec/implementations/cached.py", line 406, in <lambda>
    return lambda *args, **kw: getattr(type(self), item).__get__(self)(
  File "/Users/jwodder/.local/virtualenvwrapper/venvs/tmp-aaf3a4de95a14b1/lib/python3.10/site-packages/fsspec/implementations/cached.py", line 357, in close_and_update
    c = self.cached_files[-1][path]
KeyError: '/uuid'
Exception ignored in: <function AbstractBufferedFile.__del__ at 0x10ddf55a0>
Traceback (most recent call last):
  File "/Users/jwodder/.local/virtualenvwrapper/venvs/tmp-aaf3a4de95a14b1/lib/python3.10/site-packages/fsspec/spec.py", line 1751, in __del__
  File "/Users/jwodder/.local/virtualenvwrapper/venvs/tmp-aaf3a4de95a14b1/lib/python3.10/site-packages/fsspec/implementations/cached.py", line 344, in <lambda>
  File "/Users/jwodder/.local/virtualenvwrapper/venvs/tmp-aaf3a4de95a14b1/lib/python3.10/site-packages/fsspec/implementations/cached.py", line 406, in <lambda>
  File "/Users/jwodder/.local/virtualenvwrapper/venvs/tmp-aaf3a4de95a14b1/lib/python3.10/site-packages/fsspec/implementations/cached.py", line 357, in close_and_update
KeyError: '/uuid'

Inserting a print statement in WholeFileCacheFileSystem.close_and_update() reveals that the keys in self.cached_files[-1] are URLs (i.e., 'https://httpbin.org/uuid') rather than the /uuid paths that the code expects.

@martindurant
Copy link
Member

This appears required but not sufficient to fix the problem:

--- a/fsspec/implementations/http.py
+++ b/fsspec/implementations/http.py
@@ -133,7 +133,7 @@ class HTTPFileSystem(AsyncFileSystem):
     @classmethod
     def _strip_protocol(cls, path):
         """For HTTP, we always want to keep the full URL"""
-        return path
+        return str(path)

(because paths are now URL class, not simple string; this should maybe be fixed on the .path attribute of the file instance)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants