xcube to write sparse zarrs #688
Comments
Hi @AliceBalfanz, is it normal that it takes so long to import the improvements introduced in the used libraries?
@AliceBalfanz and @elliot-lvs can you please explain which function you use to write your datasets to disk?
Can you explain "necessary", please? Do you mean it was faster before?
There is no special code in xcube that should prevent the |
"Necessary" was just a reference to the timings I get relative to how little data is written (even with an SSD). I honestly don't know if it was faster before, but when I do not use rasterize_features, I get the impression that more data is written in the same amount of time. I'm working with a 2-3 MB Zarr: rasterize_features() takes less than a second, and writing the result to the SSD takes more than 2 minutes.
Just checked: the problem is that xarray does not exploit the new Zarr feature, introduced in Zarr 2.11, of not writing empty chunks. When forcing the related Zarr encoding option:

encodings = {
    var_name: {**var.encoding, "write_empty_chunks": False}
    for var_name, var in dataset.data_vars.items()
}
dataset.to_zarr(path, mode="w", encoding=encodings)
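The snippet above can be wrapped in a small helper, together with a way to check on disk whether empty chunks were actually skipped. This is only a sketch: `sparse_encodings` and `count_chunk_files` are hypothetical names, not part of xcube, xarray or Zarr, and the file counter assumes a Zarr v2 directory store, where metadata file names start with a dot.

```python
import os
from typing import Any, Dict


def sparse_encodings(dataset: Any) -> Dict[str, Dict[str, Any]]:
    """Return per-variable encodings that ask Zarr to skip empty chunks.

    `dataset` only needs a `data_vars` mapping whose values carry an
    `encoding` dict, which is what an xarray.Dataset provides.
    The existing encodings are copied, not modified in place.
    """
    return {
        name: {**var.encoding, "write_empty_chunks": False}
        for name, var in dataset.data_vars.items()
    }


def count_chunk_files(store_path: str) -> int:
    """Count chunk files in a Zarr directory store.

    Assumption: Zarr v2 layout, where metadata files start with a dot
    (.zgroup, .zarray, .zattrs, .zmetadata) and every other file is a chunk.
    Chunks of coordinate variables are counted as well.
    """
    return sum(
        1
        for _root, _dirs, files in os.walk(store_path)
        for file_name in files
        if not file_name.startswith(".")
    )
```

A possible use is to call `dataset.to_zarr(path, mode="w", encoding=sparse_encodings(dataset))` and then compare `count_chunk_files(path)` with the count obtained from a write without the encoding option.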
Is your feature request related to a problem? Please describe.
When I create an xcube dataset in Zarr format, all chunks are written to disk, even those that contain only NaNs. After writing the cube, I apply xcube prune to get rid of the empty chunks and save disk space. A recent Zarr release (https://zarr.readthedocs.io/en/stable/release.html#release-2-11-0) introduced the option of not writing NaN-only chunks to disk. xcube should use that right away to save space and (user) time.
To reproduce:
I have an xcube env with zarr version 2.11.3
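To illustrate what the requested behaviour means, here is a conceptual sketch in plain Python. It is not Zarr's implementation: `write_chunks` and `is_empty` are hypothetical names, the data is a flat list rather than an n-dimensional array, and "empty" is taken to mean that every value in the chunk is NaN (in Zarr, as far as I understand, a chunk counts as empty when it consists only of the array's fill value).

```python
import math
import os
from typing import List


def is_empty(chunk: List[float]) -> bool:
    """A chunk is treated as empty if all of its values are NaN."""
    return all(math.isnan(value) for value in chunk)


def write_chunks(
    values: List[float],
    chunk_size: int,
    out_dir: str,
    write_empty_chunks: bool = True,
) -> int:
    """Write `values` as one file per chunk and return the number of files.

    With write_empty_chunks=False, NaN-only chunks are skipped, so a
    sparse array produces far fewer files.
    """
    written = 0
    for index, start in enumerate(range(0, len(values), chunk_size)):
        chunk = values[start:start + chunk_size]
        if not write_empty_chunks and is_empty(chunk):
            continue  # nothing but NaNs: leave no file on disk
        with open(os.path.join(out_dir, str(index)), "w") as chunk_file:
            chunk_file.write(",".join(repr(value) for value in chunk))
        written += 1
    return written
```

For a list of twelve values in chunks of four, where only the middle chunk holds real data, the default writes three files and `write_empty_chunks=False` writes one. A reader of such a store has to treat a missing chunk file as "all NaN", which is the role the fill value plays in Zarr.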