Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New alignment option: "exact" without broadcasting OR Turn off automatic broadcasting #6806

Closed
dcherian opened this issue Jul 18, 2022 · 9 comments · Fixed by #8784
Closed

Comments

@dcherian
Copy link
Contributor

Is your feature request related to a problem?

If we have two objects with dims x and x1, then xr.align(..., join="exact") will pass because these dimensions are broadcastable.

I'd like a stricter option (join="strict"?) that disallows broadcasting.

Describe the solution you'd like

xr.align(
    xr.DataArray([1], dims="x"),
    xr.DataArray([1], dims="x1"),
    join="strict",
)

would raise an error.

It'd be nice to have this as a built-in option so we can use

with xr.set_options(arithmetic_join="strict"):
    ...

Describe alternatives you've considered

An alternative would be to allow control over automatic broadcasting through the set_options context manager., but that seems like it would be more complicated to implement.

Additional context

This turns up in staggered grid calculations with xgcm where it is easy to mistakenly construct very high-dimensional arrays because of automatic broadcasting.

@benbovy
Copy link
Member

benbovy commented Jul 18, 2022

This turns up in staggered grid calculations with xgcm where it is easy to mistakenly construct very high-dimensional arrays because of automatic broadcasting.

Do you have an example that illustrates this issue?

I'm curious to see if in that specific case using a custom index for staggered grid coordinates wouldn't work as an alternative solution. The alignment rules are pretty strict (same index type + same sequence of coordinate names & dims). Not 100% sure if that applies in the xgcm case, but in theory it would raise an error regardless of the join method if you try to align objects that have broadcastable coordinates with incompatible indexes.

@TomNicholas
Copy link
Member

I'm curious to see if in that specific case using a custom index for staggered grid coordinates wouldn't work as an alternative solution. The alignment rules are pretty strict (same index type + same sequence of coordinate names & dims). Not 100% sure if that applies in the xgcm case, but in theory it would raise an error regardless of the join method if you try to align objects that have broadcastable coordinates with incompatible indexes.

This is on my to-do list to think about really hard 😅

I think one problem would be that sometimes a different grid position implies a different length coordinate array, e.g. "outer" vs "center" is larger by one element.

@dcherian
Copy link
Contributor Author

Basically I want the following to raise an error because I've picked the wrong variable with dimension x1 instead of dimension x by mistake

xr.DataArray([1], dims="x") / xr.DataArray([1], dims="x1")

I think we could do it with a "staggered grid index", but it seems useful in general

@benbovy
Copy link
Member

benbovy commented Jul 19, 2022

Ah ok I see, thanks! Yes I agree this may be useful.

@max-sixty
Copy link
Collaborator

max-sixty commented Feb 10, 2024

@dcherian — I'm discussing this with @etienneschalk over at #8698

Can we clarify your proposal? Is it:

  1. All dims must match — i.e. ds1.dims == ds2.dims, as well as each dim being the same size as its counterpart?
  2. ...or something more muted, like each dim being the same size as its counterpart, and not broadcasting a dimension of size 1 to the same-named dimension of size n?

The first seems very strict, I haven't hit that personally, but possibly others have...

@dcherian
Copy link
Contributor Author

Thoughts from the meeting:

"join" is about indexes, and alignment and broadcasting should be orthogonal ops.

So instead of a new join option, we'll add an option to control the broadcasting. So something like:

  1. align(..., join='exact', broadcast=False) # new kwarg broadcast
  2. And a new global option arithmetic_broadcast to complement arithmetic_join . This will control automatic broadcasting in binary ops.

@etienneschalk
Copy link
Contributor

etienneschalk commented Feb 17, 2024

Hello @dcherian

I adapted changes made in #8698 with the added clear separation between the join and broadcast keyword parameters.

So while the form (API) changed, I did not change the current substance, and it is still not entirely clear for me, what is expected to implement this issue successfully. I wrote a small text following the discussions with @max-sixty about the changes I currently made, and my current understanding: #8698 (comment) ; is this aligned with what you would expect? (pun intended... 😅)

You can have a look at test_broadcast_on_vs_off_global_option in the PR, testing the behaviour you described in your comment

@dcherian
Copy link
Contributor Author

@etienneschalk I sincerely apologized. I deeply misunderstood what align was doing when I wrote this issue. align isn't actually broadcasting, it is simply checking indexes (as it should).

Given the discussion in the meeting, I think the simplest solution is to raise an error here if arithmetic_broadcast is False:

xarray/xarray/core/variable.py

Lines 2265 to 2268 in ff0d056

if reflexive and issubclass(type(self), type(other)):
other_data, self_data, dims = _broadcast_compat_data(other, self)
else:
self_data, other_data, dims = _broadcast_compat_data(self, other)

This should be a significantly smaller and easier PR.

Again, apologies for leading you down the wrong path.

@etienneschalk
Copy link
Contributor

No problem, even if the PR is not solving what was expected, it was the opportunity for me to dig into the xarray internal logic which is valuable learning!

I can try to implement your suggestion, reusing the existing 'arithmetic_broadcast' flag

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants