Proposal: Object versioning... #14

Open
tam203 opened this issue Apr 11, 2019 · 5 comments
Labels
protocol-extension Protocol extension related issue

Comments


tam203 commented Apr 11, 2019

I've written a blog post about this, "How to (and not to) handle metadata in high momentum datasets", so for a more thorough dive please read that, but in short:

I'm really interested in moving datasets backed by an object store (S3 in my case). S3 is eventually consistent, so there is an issue whenever you change more than one object at (approximately) the same time: on read, you don't know what combination of versions you'll get.

This could be an issue if I update .zarray to grow my array and also update .zattrs with some metadata to reflect this change. On read I could get the new metadata and the old shape, or vice versa. Both of which would be bad.
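To make the failure mode concrete, here is a minimal sketch of such a torn read. It makes several assumptions: a plain dict stands in for the object store, the interleaving is ordered by hand rather than produced by real eventual consistency, the metadata documents are trimmed to a couple of fields, and `n_timesteps` is a hypothetical attribute name used only for illustration.

```python
import json

# A plain dict stands in for the object store (assumption: not real S3).
# Documents are trimmed; a real .zarray carries more fields.
store = {
    ".zarray": json.dumps({"shape": [10], "chunks": [5]}),
    ".zattrs": json.dumps({"n_timesteps": 10}),  # hypothetical attribute
}


def read_metadata(store):
    """Read both metadata objects with two independent lookups."""
    shape = json.loads(store[".zarray"])["shape"]
    attrs = json.loads(store[".zattrs"])
    return shape, attrs


# The writer performs the first of its two writes...
store[".zarray"] = json.dumps({"shape": [20], "chunks": [5]})

# ...and a reader arrives before the second write lands.
torn_shape, torn_attrs = read_metadata(store)

# The writer finishes; later readers see a consistent pair again.
store[".zattrs"] = json.dumps({"n_timesteps": 20})
final_shape, final_attrs = read_metadata(store)

print(torn_shape, torn_attrs)    # new shape paired with old attributes
print(final_shape, final_attrs)  # consistent pair
```

The reader in the middle sees the grown shape alongside attributes that still describe the old array, and nothing in either object tells it that the pair does not belong together.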

This becomes more pronounced when working with complex datasets with coordinates etc. When saved as Zarr by Xarray, these end up in different arrays in the same group. But nothing ties together which version of each object you get, so an update followed by a read could result in all kinds of corruption.

Some of this needs to be resolved higher up in the tooling (xarray, etc.), but I think Zarr development needs to be aware of the challenge and support it.


tam203 commented Apr 11, 2019

P.S. If you would rather have suggestions such as these in the main zarr issues, please let me know.


tam203 commented Apr 11, 2019

I also wonder if dropping .zattrs and making it a property of .zarray or .zgroup would help this somewhat and maybe have other advantages (and disadvantages).
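A rough sketch of what that could look like, under stated assumptions: the attributes live under a hypothetical `attributes` key inside the array metadata document, a dict again stands in for the object store, and the helper names are invented for illustration. This is not the current on-disk layout.

```python
import json

store = {}  # stand-in for the object store (assumption)


def write_array_metadata(store, shape, attributes):
    """Write shape and attributes together as a single object."""
    doc = {"shape": list(shape), "attributes": dict(attributes)}
    store[".zarray"] = json.dumps(doc)  # one object, one write


def read_array_metadata(store):
    """A single read returns a shape and attributes from the same write."""
    doc = json.loads(store[".zarray"])
    return doc["shape"], doc["attributes"]


write_array_metadata(store, [10], {"n_timesteps": 10})
before = read_array_metadata(store)

write_array_metadata(store, [20], {"n_timesteps": 20})
after = read_array_metadata(store)

print(before)
print(after)
```

Because shape and attributes travel in one object, a reader gets either the old pair or the new pair, never a mix. It does not help with the multi-array case above, since coordinates and data would still live in separate objects.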

jstriebel added the protocol-extension label Nov 16, 2022
jstriebel (Member) commented

Maybe also related to #76?

rabernat (Contributor) commented

Also very closely related to #154. Joe and I are working intensely on versioning right now.

jakirkham (Member) commented

Also issue #82.

4 participants