Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Require same datatype for all arrays in multiscales:datasets #154

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

constantinpape
Copy link
Contributor

Fixes #2

I think it is important to include this change rather soon. Right now data with different dtypes across the dimensions is technically valid according to the spec, but will cause issues for several implementations.
(It does not work in MoBIE; I am pretty sure the same is true for all tools based on BigDataViewer, and I suspect any tool implemented in a strongly typed language will have issues with this).

@github-actions
Copy link
Contributor

Automated Review URLs

Copy link
Member

@will-moore will-moore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. A couple of minor comments

to the current zarr group. The "path"s MUST be ordered from largest (i.e. highest resolution) to smallest.

Each "datasets" dictionary MUST have the same number of dimensions and MUST NOT have more than 5 dimensions. The number of dimensions and order MUST correspond to number and order of "axes".
All arrays referenced in "datasets" MUST have the same number of dimensions and MUST have the same datatype. The number of dimensions MUST NOT be more than 5 dimensions.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"The number of dimensions MUST NOT be more than 5 dimensions" - I think this rule is due to be removed in v0.5? cc@bogovicj I don't know if that was going to happen in #138 but I don't see that it has yet. So it may need a rebase etc if this is merged first to avoid conflicts

@@ -365,10 +366,11 @@ The order of the entries MUST correspond to the order of dimensions of the zarr
If there are three spatial axes where two correspond to the image plane ("yx") and images are stacked along the other (anisotropic) axis ("z"), the spatial axes SHOULD be ordered as "zyx".

Each "multiscales" dictionary MUST contain the field "datasets", which is a list of dictionaries describing the arrays storing the individual resolution levels.
Each dictionary in "datasets" MUST contain the field "path", whose value contains the path to the array for this resolution relative
Each dictionary in "datasets" MUST contain the field "path", whose value contains the path to the array for this resolution relative.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Full stop added accidentally? The sentence continues on the next line

@@ -153,7 +153,8 @@ For this example we assume an image with 5 dimensions and axes called `t,c,z,y,x
│ ... # which is a folder containing chunk files which compose the array.
├── n # The name of the array is arbitrary with the ordering defined by
│ │ # by the "multiscales" metadata, but is often a sequence starting at 0.
│ │
│ │ # All arrays must have the same datatype and the same number of dimensions.
| |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor point: Looks like 2 lines have changed because whitespace was added to the "blank" line. | | Just adding 1 line is preferable.

@clbarnes
Copy link

clbarnes commented Jul 5, 2023

I suppose one place where this might not be wanted would be in label datasaets where the bottom level is just a voxel array and higher levels are variable length as labels get collapsed together? Storing that bottom level as variable length could potentially be costly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Constrain dtypes & shape for multiscale arrays
3 participants