DataArray.set_index can add new dimensions that are absent on the underlying array. #9278
Comments
In your example
|
Yes, possibly we should raise an error on this? Possibly our indexing work means we can now create multiple indexes on a dimension, so we want to be able to Regardless, I'm not sure what the |
You've created a new variable named |
But there's no dimension named
|
Ah, now I see. Yes, that's a bug.
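For illustration, the kind of check suggested above (raising an error when the index would sit on a dimension the array does not have) might look roughly like the following plain-Python sketch. It is not xarray's validation code: the function `validate_set_index`, its arguments, and the coord names `a` and `b` are all hypothetical, and it assumes each coord must be one-dimensional along the target dimension.

```python
def validate_set_index(array_dims, coord_dims, indexes):
    """Raise if a requested index would introduce a dimension the array lacks.

    array_dims: tuple of the array's dimension names, e.g. ("time",)
    coord_dims: mapping of coord name -> tuple of dims that coord lies along
    indexes:    mapping of target dimension -> list of coord names to index by

    Hypothetical helper for illustration only, not part of xarray.
    """
    for dim, coord_names in indexes.items():
        # The target dimension must already exist on the array.
        if dim not in array_dims:
            raise ValueError(
                f"dimension {dim!r} is not one of the array dimensions {array_dims}"
            )
        # Assumption: every coord used for the index is 1-D along that dimension.
        for name in coord_names:
            if coord_dims[name] != (dim,):
                raise ValueError(
                    f"coord {name!r} has dims {coord_dims[name]}, expected {(dim,)}"
                )


# Illustrative layout: one dimension `time`, two coords along it.
array_dims = ("time",)
coord_dims = {"a": ("time",), "b": ("time",)}

# Indexing along an existing dimension passes silently.
validate_set_index(array_dims, coord_dims, {"time": ["a", "b"]})

# Indexing along a dimension the array does not have is rejected.
try:
    validate_set_index(array_dims, coord_dims, {"combined": ["a", "b"]})
    message = None
except ValueError as err:
    message = str(err)
```

Whether an error is the right behaviour, or whether several indexes on one dimension should be allowed instead, is exactly the open question in this thread; the sketch only shows the stricter option.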
What is your issue?
I know we've done lots of great work on indexes. But I worry that we've made some basic things confusing. (Or at least I'm confused, very possibly I'm being slow this morning?)
A colleague asked how to group by multiple coords. I told them to `set_index(foo=coord_list).groupby("foo")`. But that seems to work inconsistently.
Here it works great:
Then make the two coords into a multiindex along `d` and group by `d`. We successfully get a value for each of the three values on `d`.
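As a rough mental model of this working case, grouping by a multiindex along `d` amounts to grouping the positions along `d` by the tuple of coord values at each position. The following is a plain-Python sketch of that idea, not xarray's implementation; the helper `group_mean_by_coords`, the coord names `a` and `b`, and the data values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean


def group_mean_by_coords(values, *coords):
    """Average `values` over positions that share the same tuple of coord values.

    Hypothetical helper: every coord must have one entry per position along
    the shared dimension, the way coords along `d` align with the data.
    """
    for coord in coords:
        if len(coord) != len(values):
            raise ValueError("each coord must align with the grouped dimension")

    groups = defaultdict(list)
    # zip(*coords) yields one tuple of coord values per position along `d`.
    for key, value in zip(zip(*coords), values):
        groups[key].append(value)
    return {key: mean(group) for key, group in groups.items()}


# Illustrative data: six positions along `d`, two coords along the same dimension.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
a = ["x", "x", "y", "y", "z", "z"]
b = [0, 0, 1, 1, 2, 2]

result = group_mean_by_coords(values, a, b)
# One entry per distinct (a, b) pair: ("x", 0), ("y", 1), ("z", 2).
```

The key point is that both coords lie along the dimension being grouped, so each distinct pair of values becomes exactly one group.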
But here it doesn't work as expected:
Then we try grouping by `combined`, and we get a value for every value of `combined` and `time`?
I'm guessing the reason is that `combined (combined) object 48B MultiIndex` is treated as a dimension? But the array itself shows `<xarray.DataArray (time: 6)> Size: 48B` as the dimensions. What's a good mental model for this?
We could try `reindexed.groupby('combined').mean(...)` or `reindexed.groupby('combined').mean('time')` to reduce over the `time` dimension. But that gets even more confusing: we then don't reduce over the groups of `combined`!
!To the extent this sort of point is correct and it's not just me misunderstanding: I know we've done some really great work recently on expanding xarray forward. But one of the main reasons I fell in love with xarray 10 years ago was how explicit the API was relative to pandas. And I worry that this sort of thing sets us back a bit.