Creating a Zarr.group using fsspec.FSMap fails #1353

Closed
Swordcat opened this issue Feb 22, 2023 · 5 comments
Labels
bug Potential issues with the zarr-python library

Comments

@Swordcat
Contributor

Zarr version

v2.14.1

Numcodecs version

v0.11.0

Python Version

3.11

Operating System

macOS 13.2.1

Installation

using pip into a virtual environment

Description

Creating a group with the zarr.group convenience function and an fsspec.FSMap store fails because mode='w' is not passed to _normalize_store_arg.

This error stems from a change made in #1304, where an fsspec.FSMap is promoted to a zarr.storage.FSStore within the zarr.storage._normalize_store_arg_v2 function using the default read-only mode.

A fix for a similar error in array creation (zarr.create) was implemented in #1309.

I propose a similar change to the zarr.group function, changing its first line from

store = _normalize_store_arg(store, zarr_version=zarr_version)

to

store = _normalize_store_arg(store, zarr_version=zarr_version, mode='w')
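
To illustrate the mechanism behind this proposal, here is a minimal, self-contained sketch. It does not use zarr or fsspec: ToyFSStore, toy_normalize_store_arg, toy_group_buggy and toy_group_fixed are hypothetical stand-ins written only for this illustration, and the b"{}" payload is a placeholder for real group metadata. It only assumes what is described above, namely that a plain mapping is promoted to a wrapper store whose mode defaults to read-only unless the caller forwards mode='w'.

```python
class ReadOnlyError(PermissionError):
    """Toy stand-in for zarr.errors.ReadOnlyError."""


class ToyFSStore:
    """Toy wrapper around a mutable mapping; refuses writes when mode == 'r'."""

    def __init__(self, mapping, mode="r"):
        self.mapping = mapping
        self.mode = mode

    def __setitem__(self, key, value):
        if self.mode == "r":
            raise ReadOnlyError("object is read-only")
        self.mapping[key] = value


def toy_normalize_store_arg(store, mode="r"):
    """Promote a plain mapping to a ToyFSStore using the caller's mode."""
    if isinstance(store, ToyFSStore):
        return store
    return ToyFSStore(store, mode=mode)


def toy_group_buggy(store):
    # No mode is forwarded, so the promoted store keeps the read-only default.
    store = toy_normalize_store_arg(store)
    store[".zgroup"] = b"{}"  # raises ReadOnlyError
    return store


def toy_group_fixed(store):
    # Forwarding mode="w" makes the promoted store writable.
    store = toy_normalize_store_arg(store, mode="w")
    store[".zgroup"] = b"{}"
    return store


backing = {}
try:
    toy_group_buggy(backing)
except ReadOnlyError as exc:
    print("buggy:", exc)  # buggy: object is read-only

toy_group_fixed(backing)
print(sorted(backing))  # ['.zgroup']
```

The only difference between the two toy functions is the forwarded mode argument, which mirrors the one-line change proposed above.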

Steps to reproduce

The error can be reproduced with the following code

from fsspec.implementations.memory import MemoryFileSystem

import zarr

mfs = MemoryFileSystem()
fsmap = mfs.get_mapper("memory:/tmp/store")
group = zarr.group(store=fsmap)

Console output

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/brandur/Documents/Repositories/zarr-python-dev/zarr/hierarchy.py", line 1355, in group
    init_group(store, overwrite=overwrite, chunk_store=chunk_store,
  File "/Users/brandur/Documents/Repositories/zarr-python-dev/zarr/storage.py", line 648, in init_group
    _init_group_metadata(store=store, overwrite=overwrite, path=path,
  File "/Users/brandur/Documents/Repositories/zarr-python-dev/zarr/storage.py", line 711, in _init_group_metadata
    store[key] = store._metadata_class.encode_group_metadata(meta)  # type: ignore
    ~~~~~^^^^^
  File "/Users/brandur/Documents/Repositories/zarr-python-dev/zarr/storage.py", line 1410, in __setitem__
    raise ReadOnlyError()
zarr.errors.ReadOnlyError: object is read-only

Additional output

No response

@Swordcat added the bug label on Feb 22, 2023
@rabernat
Contributor

Flagging as a duplicate of #1352.

@martindurant - any thoughts here?

@rabernat
Contributor

BTW, thanks a lot for reporting this @Swordcat! We really appreciate it, and we're sorry for this regression!

> I propose introducing a similar change to the Zarr.group function, changing the first line of the function from store = _normalize_store_arg(store, zarr_version=zarr_version) to store = _normalize_store_arg(store, zarr_version=zarr_version, mode='w')

This sounds like a great way forward!

Swordcat pushed a commit to Swordcat/zarr-python that referenced this issue Feb 22, 2023
rabernat pushed a commit that referenced this issue Feb 23, 2023
* Fix creating a group with fsmap per issue #1353, regression test added

* Update release notes
@triciajam

I can reproduce the bug with the following setup:

zarr 2.13.6
python 3.10.9
numcodecs 0.11.0
OS AmazonLinux2
Installation conda

And also using s3fs:

import s3fs
import zarr

bucket_name = "s3_bucket_I_can write to"
s3 = s3fs.S3FileSystem(anon=False)
store = s3fs.S3Map(root=f'{bucket_name}/a.zarr', s3=s3, check=False)
zarr_store = zarr.group(store=store)
zarr_store

output:

ReadOnlyError                             Traceback (most recent call last)
Cell In[53], line 6
      4 s3 = s3fs.S3FileSystem(anon=False)
      5 store = s3fs.S3Map(root='daskzarrstack-daskzarr87beea34-w35ea5m24oq4/a.zarr', s3=s3, check=False)
----> 6 zarr_store = zarr.group(store=store)
      7 zarr_store
      9 # with s3.open('daskzarrstack-daskzarr87beea34-w35ea5m24oq4/new-file', 'wb') as f:
     10 #     f.write(2*2**20 * b'a')

File ~/anaconda3/envs/zarr_py310_nb/lib/python3.10/site-packages/zarr/hierarchy.py:1355, in group(store, overwrite, chunk_store, cache_attrs, synchronizer, path, zarr_version)
   1352     requires_init = overwrite or not contains_group(store, path)
   1354 if requires_init:
-> 1355     init_group(store, overwrite=overwrite, chunk_store=chunk_store,
   1356                path=path)
   1358 return Group(store, read_only=False, chunk_store=chunk_store,
   1359              cache_attrs=cache_attrs, synchronizer=synchronizer, path=path,
   1360              zarr_version=zarr_version)

File ~/anaconda3/envs/zarr_py310_nb/lib/python3.10/site-packages/zarr/storage.py:643, in init_group(store, overwrite, path, chunk_store)
    640     store['zarr.json'] = store._metadata_class.encode_hierarchy_metadata(None)  # type: ignore
    642 # initialise metadata
--> 643 _init_group_metadata(store=store, overwrite=overwrite, path=path,
    644                      chunk_store=chunk_store)
    646 if store_version == 3:
    647     # TODO: Should initializing a v3 group also create a corresponding
    648     #       empty folder under data/root/? I think probably not until there
    649     #       is actual data written there.
    650     pass

File ~/anaconda3/envs/zarr_py310_nb/lib/python3.10/site-packages/zarr/storage.py:706, in _init_group_metadata(store, overwrite, path, chunk_store)
    704 key = _prefix_to_group_key(store, _path_to_prefix(path))
    705 if hasattr(store, '_metadata_class'):
--> 706     store[key] = store._metadata_class.encode_group_metadata(meta)  # type: ignore
    707 else:
    708     store[key] = encode_group_metadata(meta)

File ~/anaconda3/envs/zarr_py310_nb/lib/python3.10/site-packages/zarr/storage.py:1398, in FSStore.__setitem__(self, key, value)
   1396 def __setitem__(self, key, value):
   1397     if self.mode == 'r':
-> 1398         raise ReadOnlyError()
   1399     key = self._normalize_key(key)
   1400     path = self.dir_path(key)

ReadOnlyError: object is read-only

I confirmed that I can write to the bucket directly with s3fs. The following works:

with s3.open(f'{bucket_name}/test', 'wb') as f:
    f.write(b"Test line")

Do you have a timeline for when the fixes will be available on PyPI or conda? I much appreciate your work on this.

@rabernat
Contributor

I see no reason why we shouldn't make a release now to fix this rather severe bug.

@rabernat
Contributor

Zarr 2.14.2 is now on PyPI. conda-forge will take a little longer.

Closed by #1354.
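
For anyone checking whether an environment already has the fix, here is a rough sketch of a version check. It relies only on the statement above that 2.14.2 contains the fix. The helper names parse_release and has_group_fix are hypothetical, and the parser is deliberately naive: it reads only the leading numeric components and ignores pre-release or local suffixes. It does not tell you whether an older release predates the regression, since this thread does not say which release first shipped it.

```python
import re


def parse_release(version):
    """Return the leading numeric components of a version string as a tuple."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def has_group_fix(version, fixed_in=(2, 14, 2)):
    """True if `version` is at least the release reported to contain the fix."""
    return parse_release(version) >= fixed_in


# Versions mentioned in this thread.
for v in ("2.13.6", "2.14.1", "2.14.2"):
    print(v, has_group_fix(v))
# 2.13.6 False
# 2.14.1 False
# 2.14.2 True
```

To check the installed package, the version string can be obtained from the standard library with importlib.metadata.version("zarr") and passed to has_group_fix.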
