Add chunks argument to {zeros/ones/empty}_like. #5144

Closed
nbren12 opened this issue Apr 12, 2021 · 5 comments
Labels
topic-arrays (related to flexible array support), topic-dask

Comments

@nbren12
Contributor

nbren12 commented Apr 12, 2021

Describe the solution you'd like

We have started using xarray objects as a "schema" for initializing Zarr stores that are then written to with the region argument of to_zarr. For example:

output_schema.to_zarr(path, compute=False)
for region in regions:
    output = func(input_data.isel(region))
    output.to_zarr(path, region=region)

Currently, xarray's tools for computing the output_schema Dataset are lacking, since rechunking existing datasets can be slow. dask.array.zeros_like takes a chunks argument. Can we add one here too?
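
For reference, a minimal sketch of the dask behaviour mentioned above: dask.array.zeros_like accepts a chunks argument, so the result can use a different chunk layout from its template without a separate rechunk step. The shapes and chunk sizes are made up for illustration.

```python
import dask.array as da

# Hypothetical template: a single-chunk 100 x 80 array.
template = da.ones((100, 80), chunks=(100, 80))

# Zeros with the template's shape and dtype, but a new chunk layout.
schema = da.zeros_like(template, chunks=(25, 40))

print(schema.shape)   # (100, 80)
print(schema.chunks)  # ((25, 25, 25, 25), (40, 40))
```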

Describe alternatives you've considered

.chunk
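
A rough sketch of the .chunk alternative, again with made-up shapes and chunk sizes (dask needs to be installed). Note that for a NumPy-backed template, xarray.zeros_like first allocates the whole array in memory, and .chunk only wraps it in dask afterwards.

```python
import numpy as np
import xarray as xr

# Hypothetical NumPy-backed template.
template = xr.DataArray(np.ones((100, 80)), dims=("x", "y"))

# Create the zeros eagerly, then convert to a dask-backed array.
schema = xr.zeros_like(template).chunk({"x": 25, "y": 40})

print(schema.chunks)  # ((25, 25, 25, 25), (40, 40))
```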

@dcherian dcherian added topic-dask topic-arrays related to flexible array support labels Jul 4, 2021
@max-sixty
Collaborator

If #8251 were solved, we could do:

xr.DataArray().chunk(42).broadcast_like(ds)

Until then, we can use a hack:

xr.DataArray([1]).chunk(2).broadcast_like(ds).squeeze('dim_0')

(Instead of broadcast_like, the dimensions could also be added with .expand_dims; I'm not sure exactly how you prefer to initialize the arrays.)
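
A self-contained version of the hack, as a sketch only: ds here is a small hypothetical dataset, and the resulting array is filled with ones because that is the value in the seed array.

```python
import numpy as np
import xarray as xr

# Hypothetical dataset standing in for `ds`.
ds = xr.Dataset({"a": (("x", "y"), np.zeros((4, 3)))})

# Start from a length-1 dask-backed array, broadcast it against ds,
# then drop the dummy "dim_0" dimension again.
template = xr.DataArray([1]).chunk(2).broadcast_like(ds).squeeze("dim_0")

print(dict(template.sizes))
```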

I would have a preference for using orthogonal methods like .chunk, rather than adding more kwargs to methods.

Do you think that works for your case, @nbren12?

@dcherian
Contributor

dcherian commented Oct 24, 2023

rather than adding more kwargs to methods.

I think these array creation functions are almost always going to be messy if we want to support all the array types in them. It just happens that there is a dask-specific argument here. For example, we might want to pass format for pydata/sparse.

We could accept an array_like argument and pass **kwargs to the appropriate library, though it turns out that numpy.zeros accepts like but numpy.zeros_like does not. I think the inconsistency is OK. I've never missed xarray.zeros.
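
The NumPy inconsistency can be checked with a short sketch (assuming NumPy 1.20 or later, where the like keyword was introduced; the variable names are only for illustration):

```python
import numpy as np

ref = np.arange(3)

# numpy.zeros takes `like` ...
a = np.zeros(3, like=ref)

# ... but numpy.zeros_like has no `like` keyword, so this raises TypeError.
try:
    np.zeros_like(ref, like=ref)
    zeros_like_accepts_like = True
except TypeError:
    zeros_like_accepts_like = False

print(a.shape, zeros_like_accepts_like)  # (3,) False
```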

@max-sixty
Collaborator

We could accept an array_like argument, and pass **kwargs to the appropriate library

+1

@dcherian
Contributor

Wait, this was added at some point: https://docs.xarray.dev/en/stable/generated/xarray.zeros_like.html

xarray.zeros_like(other, dtype=None, *, chunks=None, chunked_array_type=None, from_array_kwargs=None)
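
A minimal usage sketch of that signature, assuming an xarray release recent enough to include the chunks keyword and an installed dask. The template is dask-backed, as in the original use case, and the shapes and chunk sizes are made up.

```python
import dask.array as da
import xarray as xr

# Hypothetical dask-backed template with a single large chunk.
template = xr.DataArray(
    da.ones((100, 80), chunks=(100, 80)), dims=("x", "y")
)

# Request a different chunk layout directly, without a rechunk step.
schema = xr.zeros_like(template, chunks=(25, 40))

print(schema.chunks)  # ((25, 25, 25, 25), (40, 40))
```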

@max-sixty
Collaborator

Ha! OK, well, ask and ye shall receive...

(I certainly don't feel strongly enough to suggest removing it...)
