-
-
Notifications
You must be signed in to change notification settings - Fork 404
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added support for cftime types #2728
Conversation
Thanks so much for taking this on! It will make climate modelers very happy. I think I need to defer to @spencerclark or @shoyer to answer your questions, as they understand the details better. |
@philippjfr @rabernat sorry I wasn't pinged here (note in the future my GitHub user name is "spencerkclark" instead of "spencerclark"), so I'm just seeing these questions now. Thanks for starting to work on this!
In principle any of the types could appear in the wild, so ideally all of them, but if we wanted to prioritize a couple and save the rest for later, the most common would probably be
Yes, all
There is one more date type that
They'll probably appear the most in xarray and Iris data structures (since both support them natively), but they could show up in pure NumPy arrays (with dtype "object") or in Pandas DataFrames.
With regard to matplotlib support, it might be worth looking at nc-time-axis. I think Iris uses it internally (e.g. here) for their built-in plotting functionality. |
72d71a0
to
18d94de
Compare
I've tried playing with nc-time-axis a bit but I can't actually get it to plot cftime objects with non-standard reference units, e.g. I haven't found any way to plot dates like this:
My vote is to merge this without matplotlib support for now. |
elif pd and isinstance(date, pd.Timestamp): | ||
date = date.to_pydatetime() | ||
if hasattr(date, 'timetuple'): | ||
dt_int = calendar.timegm(date.timetuple())*1000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seeing this makes me a little uneasy with respect to use with cftime types. I think the built-in Python calendar
module always assumes the Gregorian calendar is used; in cftime, different date types operate on different calendars, so the "seconds since 1970-01-01T00:00:00" value could be different for the same time tuple, depending on the calendar.
But perhaps there's good reason for doing things this way (I'm not familiar with how dates are handled in bokeh/holoviews)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, definitely need another look at this. Is there a simple way to convert the values to a Gregorian calendar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see; this is appropriate if that is what you are looking to do.
However, I'm not sure if this is the ideal approach in general; for instance this will cause there to be extra space (and potentially a tick label) in the place of February 29th, 2000 for a no leap calendar (even though in that calendar that date does not exist). See an example in a gist with this branch. There are other possible edge cases too (in a 360-day calendar some dates do not exist in a Gregorian calendar, like February 30th).
I think nc-time-axis handles these edge cases properly, because it does not rely on converting dates to the Gregorian calendar. See how it handles the example above in this gist (I've also included an example of how to plot with dates generated from xarray.cftime_range
).
I understand if it would be a lot more work to solve this in general though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again, I'll look at nc_time_axis in a bit more detail, running your notebook is currently giving me an error though, which I had already filled here: SciTools/nc-time-axis#40.
ValueError: invalid year provided in cftime.DatetimeNoLeap(0, 11, 27, 1, 12, 0, 4, 3, 331)
What version of nc_time_axis and cftime have? I think I have the latest version of both:
print(cftime.__version__)
print(nc_time_axis.__version__)
1.0.0
1.1.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing, thanks for the link to that issue. In my notebook I was using cftime 1.0.2.1 and nc_time_axis 1.1.0. I can reproduce your issue if I downgrade cftime to version 1.0.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I'll try upgrading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I think we'll likely have to push ahead with the subtly wrong behavior in bokeh, as a proper solution would require writing custom bokeh models to correctly format dates in different calendars. Would you expect a warning when plotting dates in a non-Gregorian calendar until we've got a proper fix?
18d94de
to
88bebf3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for following through with this. I really appreciate you adding support for these climate-science-specific date types!
Okay, I think we'll likely have to push ahead with the subtly wrong behavior in bokeh, as a proper solution would require writing custom bokeh models to correctly format dates in different calendars.
This is totally fine. I'm not surprised it would require a custom solution in bokeh. For now users can always switch to the matplotlib backend if they want things to be exactly correct.
Would you expect a warning when plotting dates in a non-Gregorian calendar until we've got a proper fix?
If it's not too much trouble I do think a warning would be helpful in the bokeh case. For example we now warn in xarray whenever this kind of conversion is done. At a minimum I think it should be publicly documented somewhere.
holoviews/plotting/mpl/util.py
Outdated
'using:\n\tconda install -c conda-forge ' | ||
'nc_time_axis') | ||
if isinstance(value, cftime_types): | ||
value = CalendarDateTime(cftime.datetime(*value.timetuple()[0:6]), value.calendar) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason for converting to the generic cftime.datetime
here and below? I think you can just pass value
and value.calendar
to the CalendarDateTime
constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, don't think there was any good reason.
holoviews/plotting/bokeh/util.py
Outdated
Attempts highest precision conversion of different datetime | ||
formats to milliseconds since the epoch (1970-01-01 00:00:00). | ||
If datetime is a cftime with a non-standard calendar the | ||
caveats describd in cftime_to_timestamp apply. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in "described"
holoviews/plotting/bokeh/util.py
Outdated
are converted to standard Gregorian calendar. This can cause | ||
extra space to be added for dates that don't exist in the original | ||
calendar. In order to handle these dates correctly a custom bokeh | ||
model with support for other calendars would have to be defind. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in "defined."
holoviews/core/data/xarray.py
Outdated
cftime._cftime.Datetime360Day, | ||
cftime._cftime.DatetimeJulian, | ||
cftime._cftime.DatetimeNoLeap, | ||
cftime._cftime.DatetimeProlepticGregorian |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are missing cftime.DatetimeAllLeap
here. That said, all of these are subclasses of cftime.datetime
, so I think you could get away with just using cftime.datetime
here, rather than enumerating all of the different subclasses (since cftime_types
is only used for instance checks).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's much easier thanks.
Thanks, I've adapted the error message from xarray for our purposes, e.g.:
|
@jlstevens This should be ready to review and merge now. If you're wondering about all the other changes, the dt_to_int utility was all kinds of horrendously wrong with its unit conversions, which was made up for by counter-acting it with various scaling factors in the histogram and datashading operations. The operation now actually converts between time resolutions correctly. |
Thanks for tackling this! Tests are green so I'll merge now. |
Thanks again @philippjfr! |
Thank you for being so helpful @spencerkclark! |
This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
This PR adds support for
cftime
types which are a custom time type used by xarray for dates that cannot be stored in a datetime64 type. A small example:This approach seems to work well but there are a questions I have, which @rabernat and @spencerclark hopefully can answer:
cftime
provides: