## 🧰 Task

This issue outlines what will likely be needed on the napari side to support the upcoming zarr v3 spec. A primary difference introduced in v3 stores is that the data chunks and the corresponding metadata are stored in separate `data/root/` and `meta/root/` directory trees, respectively. Additionally, the data chunks are stored in a nested file layout rather than a flat one by default. These changes are intended to make working with large data having many chunks more responsive, particularly over cloud storage.

We have recently merged support for the proposed zarr v3 spec in the main development branch of `zarr-python` and are starting to look at a few downstream projects for additional testing prior to release.

### `napari.utils.io.magic_imread`

Currently uses `guess_zarr_path` (see below) to choose when to attempt opening with `read_zarr_dataset`. This function likely doesn't need any v3-specific changes.

### `napari.utils.io.guess_zarr_path`

**current behavior**

- looks for a folder ending with '.zarr' within the path

**needed changes**

### `napari.utils.io.read_zarr_dataset`

**current behavior**

- if `.zarray` is present, opens a single array
- if `.zgroup` is present, creates a list containing all arrays in the group (currently used for multiscale, but "Support for Zarr files with multiple datasets (groups)" #1406 discusses potentially adding support for loading as separate layers)

**needed changes for v3**

`.zarray` and `.zgroup` don't exist in v3. Array and group metadata are in a separate `meta/root/` folder with filenames ending in `.array.json` and `.group.json`, respectively. (Technically, the spec also allows specifying an extension other than `.json` for the metadata, but currently zarr-python always uses JSON.)

The data is then read via `dask.array.from_zarr`, which can already read v3 files if we pass a `component` kwarg (and possibly a `zarr_version` kwarg). See a WIP dask PR adding tests for this here: dask/dask#8918. That PR also shows example file listings, as a concrete example of the difference in default file layout for v3 vs. v2 stores.

I think for v3 we should be able to adapt to open an array when passed a path to the array metadata, or a group of arrays when passed a path to group metadata. We could also open based on the path to the data folder of an array or group by traversing up the tree until we find the root path containing the required `zarr.json` metadata. We can then extract the desired array key name(s) to use as the `component` argument when reading the data.

cc @joshmoore, @MSanKeys963, @jakirkham, @rabernat, @martindurant
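A minimal sketch of the path handling proposed above, assuming the v3 layout exactly as described in this issue (`zarr.json` at the store root, with `data/root/` and `meta/root/` trees beneath it). The helper names `find_v3_root`, `split_v3_path` and `read_v3_component` are hypothetical and not part of napari, zarr-python or dask. Passing `zarr_version=3` through `dask.array.from_zarr` is the "possibly needed" kwarg mentioned above and is unverified here. Only paths below `data/root/` or `meta/root/` are handled.

```python
from pathlib import Path

# Metadata suffixes as described in this issue (JSON extension assumed).
ARRAY_SUFFIX = ".array.json"
GROUP_SUFFIX = ".group.json"


def find_v3_root(path):
    """Walk up from `path` until a directory containing zarr.json is found."""
    path = Path(path)
    for candidate in (path, *path.parents):
        if (candidate / "zarr.json").is_file():
            return candidate
    return None


def split_v3_path(path):
    """Return (store_root, component) for a path inside a v3 store.

    Accepts a folder under data/root/ or a metadata file under meta/root/.
    Returns None when no enclosing store root is found.
    """
    path = Path(path)
    root = find_v3_root(path)
    if root is None:
        return None
    parts = path.relative_to(root).parts
    # Drop the leading "data/root" or "meta/root" prefix if present.
    if parts[:2] in (("data", "root"), ("meta", "root")):
        parts = parts[2:]
    component = "/".join(parts)
    # Strip the metadata suffix so a metadata path maps to the array/group key.
    for suffix in (ARRAY_SUFFIX, GROUP_SUFFIX):
        if component.endswith(suffix):
            component = component[: -len(suffix)]
    return root, component


def read_v3_component(path):
    """Hypothetical reader: open one array lazily via dask."""
    import dask.array as da  # lazy import; needs dask and a v3-capable zarr

    found = split_v3_path(path)
    if found is None:
        raise ValueError(f"no zarr.json found above {path}")
    root, component = found
    # zarr_version=3 is an assumption taken from the discussion above.
    return da.from_zarr(str(root), component=component, zarr_version=3)
```

With this approach, a path to `meta/root/grp/arr.array.json` and a path to `data/root/grp/arr` would both resolve to the same store root and the component `grp/arr`.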