Attrs are lost in mathematical computation #1271

Closed
fujiisoup opened this issue Feb 15, 2017 · 7 comments
Labels
topic-metadata Relating to the handling of metadata (i.e. attrs and encoding)

Comments

@fujiisoup
Member

Related to #138

Why is the keep_attrs option of the reduce method set to False by default?
I feel it is more natural to keep all the attrs after a computation that returns an xr.DataArray,
such as data*100.
(But it is not possible to set this option to True when using an operator.)
Would it be an option to move this setting to the init method, so that when it is True all the attrs are tracked through computations on the object and also on the objects generated from it?
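
To make the question concrete, here is a minimal sketch of the two cases. It assumes a later xarray release in which `xr.set_options(keep_attrs=True)` is available (that option did not exist when this issue was opened), and the attrs values are made up for illustration. Defaults may differ between xarray versions, so the sketch sets `keep_attrs` explicitly everywhere.

```python
import numpy as np
import xarray as xr

# Illustrative data; the attrs values are invented for this example.
data = xr.DataArray(
    np.array([1.0, 2.0, 3.0]),
    dims=("x",),
    attrs={"units": "m", "source": "sensor A"},
)

# Reductions accept keep_attrs as a keyword argument.
mean_kept = data.mean(keep_attrs=True)
mean_dropped = data.mean(keep_attrs=False)

# Operators such as * take no keyword arguments, so opting in
# has to happen through a global option (assumes a later xarray release).
with xr.set_options(keep_attrs=True):
    scaled = data * 100

print(mean_kept.attrs)
print(mean_dropped.attrs)
print(scaled.attrs)
```

Note that `scaled` still carries `units: "m"` even though the values were multiplied by 100, so kept attrs can go stale.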

@shoyer
Member

shoyer commented Feb 16, 2017

Historically, we dropped attributes in arithmetic because attributes are often used for units and we didn't want to do computation that results in the wrong units. Blindly propagating metadata can lead to it ending up in the wrong place.

Also, if you are combining multiple DataArray objects with different attrs, there are a number of options for combining them, and it wasn't obvious which is the right strategy.

But in practice, this is something that a lot of people want (better to have stale metadata than no metadata at all), so maybe I made the wrong choice here. At the very least, I would be comfortable with an option to set keep_attrs=True for every operation and/or to specify a merge strategy for combining attrs in arithmetic. (I guess I've changed my opinion from when I rejected this proposal from @jhamman several years ago.)
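
As a rough illustration of what "a merge strategy for combining attrs" could mean, the sketch below shows two candidate strategies on plain dicts. The function names `merge_attrs_override` and `merge_attrs_drop_conflicts` are hypothetical and are not xarray API; they only illustrate why the choice is not obvious when two operands disagree.

```python
def merge_attrs_override(*attrs_dicts):
    """Hypothetical strategy: take the attrs of the first operand as-is."""
    return dict(attrs_dicts[0]) if attrs_dicts else {}


def merge_attrs_drop_conflicts(*attrs_dicts):
    """Hypothetical strategy: keep a key only if no operand disagrees on it."""
    merged = {}
    conflicted = set()
    for attrs in attrs_dicts:
        for key, value in attrs.items():
            if key in conflicted:
                continue
            # Plain != is enough for simple values; array-valued attrs
            # would need a different comparison.
            if key in merged and merged[key] != value:
                del merged[key]
                conflicted.add(key)
            else:
                merged[key] = value
    return merged


# Invented attrs for two operands of, say, distance / time.
distance_attrs = {"units": "m", "crs": "EPSG:4326"}
time_attrs = {"units": "s", "crs": "EPSG:4326", "note": "resampled"}

overridden = merge_attrs_override(distance_attrs, time_attrs)
safe = merge_attrs_drop_conflicts(distance_attrs, time_attrs)
print(overridden)
print(safe)
```

The first strategy would label the result with `units: "m"`, which is wrong for a quotient; the second drops the disputed key and keeps what both operands agree on.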

@fujiisoup
Member Author

Thank you for the information.
I agree that we should not rely strongly on attrs. Units may change in arithmetic.

My view is closest to option 3 in #131.
Some attrs should be tracked and others should be dropped.
At the present stage, is the most feasible option to add keep_attrs to xr.set_options?
Or would a breaking change be an option,
such as to divide attrs into two kinds, units and metadata?

@fmaussion
Member

divide attrs into two kinds, units and metadata

Coordinate reference systems (crs) are an example of an attribute which is always valid after selection, arithmetic, or even reduction since they apply to the coordinates of the DataArray. A global option to keep certain attrs (like crs) would be very useful to downstream libraries like salem.

@dopplershift
Contributor

Could the setting not be module-wide, but be set on DataArray instances themselves? That setting could then be inherited when new ones are created from arithmetic operations.

I'd like to have the option of specifying specific attributes that should be kept, or maybe dropped. It will be really annoying to need to keep copying the grid_mapping attribute.

@shoyer
Member

shoyer commented Feb 17, 2017

Coordinate reference systems (crs) are an example of an attribute which is always valid after selection, arithmetic, or even reduction since they apply to the coordinates of the DataArray.

I wonder if it might make sense to represent crs as a (scalar) coordinate, which would already be propagated in all unambiguous cases by xarray ops.

The main reason why this is annoying is that ds.crs would return a 0-dimensional DataArray, so you would need to write ds.crs.item() to access it directly (e.g., if you want to call a method on it). Though you could actually already use the accessor interface to fix this (see this gist for details).
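
A minimal sketch of the scalar-coordinate idea, assuming the crs can be stored as a plain string (the value `"EPSG:4326"` and the array shape are invented for illustration). It shows the coordinate surviving arithmetic and a reduction, and the `.item()` call needed to get the underlying value back.

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(6.0).reshape(2, 3), dims=("y", "x"))

# Store the crs as a 0-dimensional (scalar) coordinate instead of an attr.
da = da.assign_coords(crs="EPSG:4326")

scaled = da * 100       # arithmetic with a scalar
reduced = da.mean("x")  # reduction over one dimension

# The coordinate is a 0-dimensional DataArray, so .item() is needed
# to recover the plain Python value.
print(scaled.coords["crs"].item())
print(reduced.coords["crs"].item())
```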

Could the setting not be module-wide, but be set on DataArray instances themselves? That setting could then be inherited when new ones are created from arithmetic operations.

I would strongly prefer to avoid making the xarray data model more complex by adding another type of metadata (e.g., in addition to dims, coords, attrs and name on DataArray). Whitelisting specific attrs as ones that should be preserved in application code via set_options (or encouraging users to subclass DataArray in a "safe" way) is preferable in that regard.
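
One way the whitelisting idea could look in application code today, without any change to xarray, is sketched below. The helper `with_whitelisted_attrs` and the `KEEP` tuple are hypothetical names for this example, not xarray API, and the attrs values are invented.

```python
import numpy as np
import xarray as xr

# Hypothetical application-level whitelist of attrs that stay valid
# after arithmetic and reductions.
KEEP = ("crs", "grid_mapping")


def with_whitelisted_attrs(func, source, keep=KEEP):
    """Apply func to source and carry over only the whitelisted attrs.

    Assumes func returns a new object; if it returned source itself,
    the assignment below would overwrite the attrs of source.
    """
    result = func(source)
    result.attrs = {k: source.attrs[k] for k in keep if k in source.attrs}
    return result


temp = xr.DataArray(
    np.array([280.0, 285.0]),
    dims=("x",),
    attrs={"units": "K", "grid_mapping": "latlon"},
)

anomaly = with_whitelisted_attrs(lambda a: a - a.mean(), temp)
print(anomaly.attrs)
```

Setting `attrs` explicitly on the result keeps the outcome independent of whatever default xarray applies to attrs in arithmetic.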

Either way, this issue is related to the fuller "hook" system for customized attribute handling that I imagined over in #988.

@fmaussion
Member

I wonder if it might make sense to represent crs as a (scalar) coordinate, which would already be propagated in all unambiguous cases by xarray ops.

This looks like a brilliant idea, thanks!

@fujiisoup
Member Author

I would strongly prefer to avoid making the xarray data model more complex by adding another type of metadata (e.g., in addition to dims, coords, attrs and name on DataArray). Whitelisting specific attrs as ones that should be preserved in application code via set_options (or encouraging users to subclass DataArray in a "safe" way) is preferable in that regard.

Understood.
I'll close this issue.
I guess the discussion of how this option would be implemented could be continued in #988.


5 participants