Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create fsstore from filesystem #911

Merged

Conversation

rabernat
Copy link
Contributor

@rabernat rabernat commented Dec 14, 2021

This PR adds a new argument to FSStore which allows one to pass an existing fsspec filesystem object from which to generate the FSStore. The need to initialize an FSStore from an existing filesystem came up in pangeo-forge/pangeo-forge-recipes#254. It seems like this is something people may want to do more generally.

Also fixes #993 by simply removing the check for list-ability when initializing an FSStore.

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/tutorial.rst
  • Changes documented in docs/release.rst
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

zarr/storage.py Show resolved Hide resolved
zarr/storage.py Show resolved Hide resolved
zarr/tests/test_storage.py Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Dec 14, 2021

Codecov Report

Merging #911 (c1c9255) into master (d1f590d) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #911   +/-   ##
=======================================
  Coverage   99.94%   99.94%           
=======================================
  Files          34       34           
  Lines       13719    13785   +66     
=======================================
+ Hits        13712    13778   +66     
  Misses          7        7           
Impacted Files Coverage Δ
zarr/storage.py 100.00% <100.00%> (ø)
zarr/tests/test_convenience.py 100.00% <100.00%> (ø)
zarr/tests/test_core.py 100.00% <100.00%> (ø)
zarr/tests/test_storage.py 100.00% <100.00%> (ø)

@rabernat
Copy link
Contributor Author

rabernat commented Jan 31, 2022

I would like to revive and move forward this PR. Martin, you gave a few helpful comments and ideas above. Would you mind turning these into a more formal review in terms of what you think is needed vs interesting possibility.

What we have here is a tightly scoped and tested PR to unblock a rather serious upstream problem. If there is no major technical debt incurred by this PR, could we consider merging as is (modulo doc updates from checklist) rather than expanding scope? If not, please let us know what else would be required.

Also pinging @grlee77 and @joshmoore for potential interaction with ongoing refactor of storage classes (#789).

@rabernat rabernat requested review from d-v-b and joshmoore and removed request for d-v-b January 31, 2022 18:45
@grlee77
Copy link
Contributor

grlee77 commented Jan 31, 2022

Also pinging @grlee77 and @joshmoore for potential interaction with ongoing refactor of storage classes (#789).

Thanks @rabernat, I commented above. If the __init__ signature does not change then this should be pretty easy to merge into the work there.

@joshmoore
Copy link
Member

Thanks, @grlee77. From my side, that was the only concern in terms of impact on existing work, @rabernat. @martindurant's point about the general API "burden" seems to be outstanding. Is it that a concrete alternative that doesn't break the API needs proposing from your side, @rabernat?

A side note that testing is currently blocked due to GHA weirdness. @jakirkham and I will try to get that sorted ASAP.

@joshmoore
Copy link
Member

@rabernat: would you prefer someone else to propose an alternative?

Comment on lines -1331 to -1312
if self.fs.exists(self.path) and not self.fs.isdir(self.path):
raise FSPathExistNotDir(url)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To fix #993 I have simply removed this check...

Comment on lines -1251 to -1254
filepath = os.path.join(path, self.root + "foo")
with pytest.raises(ValueError):
self.create_store(path=filepath, mode='r')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...and its corresponding test.

This test did not do anything other than test the error above. self.create_store does not actually attempt to write any data or list the store.

With the check gone, users will now see an error from fsspec if they try to do an operation that requires listing on an unlistable FSSpec store, as they would already on a generic MutableMapping-based store.

@pep8speaks
Copy link

pep8speaks commented Apr 5, 2022

Hello @rabernat! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2022-05-04 11:10:02 UTC

Copy link
Member

@martindurant martindurant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very little to comment

zarr/storage.py Show resolved Hide resolved
zarr/tests/test_convenience.py Show resolved Hide resolved
zarr/tests/test_convenience.py Show resolved Hide resolved
@rabernat rabernat force-pushed the create-fsstore-from-filesystem branch from e76a376 to cfadc1d Compare April 7, 2022 14:46
Comment on lines +27 to +28
2.11.3
------
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although 2.11.3 has been released, I noted that the docs had not been updated, so I added it here.

It seems we have skipped release 2.11.2?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those releases are coming from the 2_11 branch. (See the related #898 (comment))

I assume you intend this for release as a 2.11.x?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I have clearly not been able to follow that discussion. After reading, I now understand the situation better. Let's discuss at today's SC meeting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please ping this thread with any decision on which branch to merge into

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martindurant : @rabernat seemed to think it was ok to hold off on @grlee77's upcoming v3 configuration fix, i.e. this can stay on the branch it is and be released as something >2.11.x. If it looks like we want it in a quick ~=2.11.x release, I can attempt the backport.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will do whatever the other devs recommend here (either backport to 2.11.x or put this in 2.12.0)

I am confused about how we will maintain a consistent changelog for these two branches. So I am standing by for advice.

@rabernat
Copy link
Contributor Author

rabernat commented Apr 7, 2022

From my POV, this is now complete. Would appreciate any other feedback or otherwise a merge.

@martindurant
Copy link
Member

@joshmoore , I see you have been tagged to review. I am happy to put this in, if you don't intend to look.

@rabernat
Copy link
Contributor Author

rabernat commented Apr 7, 2022

This is my first Zarr PR in quite a while, so I just tagged the people I know to review. 😄

@rabernat rabernat force-pushed the create-fsstore-from-filesystem branch from 3bc0299 to 36c05ed Compare April 12, 2022 19:56
@rabernat
Copy link
Contributor Author

rabernat commented Apr 21, 2022

I feel confused about how to move forward with this PR in terms of how to update the changelog (see #911 (comment)). Please help me 🙏. I would love to see this merged soon as I believe it is finished. It is now starting to accumulate conflicts.

@joshmoore
Copy link
Member

Sorry for having left you hanging. I think either will be fine @rabernat. Either just add your item to unreleased or copy over from the 2 11 branch. Either way feel free to get this in and I will double check before the next release. (Of whatever branch)

Copy link
Member

@joshmoore joshmoore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've hopefully handled the conflicts with Greg's v3 work. Once this goes green, I'll merge and update the release notes to include previous versions.

@joshmoore
Copy link
Member

joshmoore commented May 4, 2022

It looks like the semantics of the update() method have changed with the refactoring for v3:

previous noise
pytest zarr/tests/test_convenience.py -svx -k2

...snip...

>           store_to_open.update(store)  # copy original store to new unlistable store

zarr/tests/test_convenience.py:305:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <zarr.storage.FSStore object at 0x1ba010970>, other = '/var/folders/z5/txc_jj6x5l5cm81r56ck1n9c0000gn/T/tmp0gy4ji7i', kwds = {}

    def update(self, other=(), /, **kwds):
        ''' D.update([E, ]**F) -> None.  Update D from mapping/iterable E and F.
            If E present and has a .keys() method, does:     for k in E: D[k] = E[k]
            If E present and lacks .keys() method, does:     for (k, v) in E: D[k] = v
            In either case, this is followed by: for k, v in F.items(): D[k] = v
        '''
        if isinstance(other, Mapping):
            for key in other:
                self[key] = other[key]
        elif hasattr(other, "keys"):
            for key in other.keys():
                self[key] = other[key]
        else:
>           for key, value in other:
E           ValueError: not enough values to unpack (expected 2, got 1)

That doesn't appear to be related to the type hierarchy:

2.11.2: FSStore.__mro__: (<class 'zarr.storage.FSStore'>, <class 'zarr._storage.store.Store'>, <class 'zarr._storage.store.BaseStore'>, <class 'collections.abc.MutableMapping'>, <class 'collections.abc.Mapping'>, <class 'collections.abc.Collection'>, <class 'collections.abc.Sized'>, <class 'collections.abc.Iterable'>, <class 'collections.abc.Container'>, <class 'object'>)
2.12.x: FSStore.__mro__: (<class 'zarr.storage.FSStore'>, <class 'zarr._storage.store.Store'>, <class 'zarr._storage.store.BaseStore'>, <class 'collections.abc.MutableMapping'>, <class 'collections.abc.Mapping'>, <class 'collections.abc.Collection'>, <class 'collections.abc.Sized'>, <class 'collections.abc.Iterable'>, <class 'collections.abc.Container'>, <class 'object'>)

Still digging in case anyone has an idea.

Edit: the problem was just that store had become a str. 🤦🏽 Fix pushed.

@joshmoore joshmoore merged commit ab6b355 into zarr-developers:master May 4, 2022
@rabernat
Copy link
Contributor Author

rabernat commented May 4, 2022

Thanks for finishing this up Josh! I really appreciate it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cannot create FSStore for unlistable http URLs
5 participants