I have been using the lat/lon dataset at gs://gcp-public-data-arco-era5/ar/full_37-1h-0p25deg-chunk-1.zarr-v3. It has been extremely helpful for setting up different projects. I am wondering whether it would be possible to rechunk the pressure-level data. Currently all pressure levels are stored in a single chunk, so subsampling by level means fetching the entire chunk, which can significantly slow down reads. Since this is in object storage, ideally we could use much smaller chunks, with each chunk covering just the lat/lon grid for a single level. What do you think?
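To put a rough number on the overhead described above, here is a small back-of-the-envelope sketch. The grid size (721 x 1440 points for 0.25 degree data), the 37 levels, and float32 values are assumptions inferred from the dataset name, not confirmed in this thread, and the figures are uncompressed sizes, so the real transfer size will differ.

```python
# Assumed layout (hypothetical, inferred from the dataset name):
# 37 pressure levels on a 721 x 1440 lat/lon grid, stored as float32.
LEVELS = 37
LAT = 721
LON = 1440
BYTES_PER_VALUE = 4  # float32


def chunk_bytes(levels_per_chunk: int) -> int:
    """Uncompressed size in bytes of one chunk holding the full lat/lon grid."""
    return levels_per_chunk * LAT * LON * BYTES_PER_VALUE


current = chunk_bytes(LEVELS)  # all levels in one chunk
per_level = chunk_bytes(1)     # one level per chunk, as proposed

print(f"all levels per chunk: {current / 1e6:.1f} MB")
print(f"one level per chunk:  {per_level / 1e6:.1f} MB")
print(f"overhead when reading a single level: {current // per_level}x")
```

Under these assumptions, reading one pressure level for one time step pulls in all 37 levels' worth of data.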
This is a lot of data, so I don't think we're going to store another duplicate version of this dataset. But there are a number of tools for rechunking the data yourself, e.g., see rechunker or xarray-beam.
I have a similar issue. If I am only interested in one region of the world and a subset of the data, I would have to download 1 PB to get only a few hundred megabytes. I do not think that is the intent.
So is there no way to download only a subset of a remote chunk? Do we always have to download the full chunk and then rechunk locally?
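As a rough illustration of what a subset request costs: a Zarr reader normally fetches whole chunks, but only the chunks that intersect the selection, not the whole store. The sketch below counts those chunks with plain index arithmetic. The chunk shape `(1, 37, 721, 1440)` for `(time, level, latitude, longitude)` is an assumption based on the dataset name and the description in this issue, and `chunks_touched` is a hypothetical helper, not part of any library.

```python
from math import prod


def chunks_touched(selection, chunk_shape):
    """Count the chunks intersected by a selection.

    selection: one (start, stop) index pair per dimension, stop exclusive.
    chunk_shape: chunk length along each dimension.
    """
    counts = []
    for (start, stop), size in zip(selection, chunk_shape):
        first = start // size        # index of the first chunk touched
        last = (stop - 1) // size    # index of the last chunk touched
        counts.append(last - first + 1)
    return prod(counts)


# Assumed chunking (hypothetical): one time step, all 37 levels, full grid.
CHUNK_SHAPE = (1, 37, 721, 1440)

# Example request: 24 hourly steps, a single level, a small lat/lon box.
selection = [(0, 24), (10, 11), (100, 140), (200, 260)]

n_chunks = chunks_touched(selection, CHUNK_SHAPE)
values_wanted = prod(stop - start for start, stop in selection)
values_fetched = n_chunks * prod(CHUNK_SHAPE)

print(f"chunks fetched: {n_chunks}")
print(f"fraction of fetched values actually used: {values_wanted / values_fetched:.2e}")
```

Under this assumed chunking, the cost scales with the number of time steps requested rather than with the size of the region, so a regional subset is wasteful but far smaller than the full dataset.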