Preprocessor to convert calendar and "fill-in" the data #2106
Comments
Hi @larsbarring many thanks for proposing this! We have a preprocessor called regrid_time that aligns time points from cubes with differing time axes on a common standardized time axis, and that should account for differing calendars too (I believe I made sure that calendars were taken care of via conversion from num to date via a standard calendar, when I wrote that), have you seen/tried it? Of course, we can always generalize it or add a new eg |
Hi @valeriupredoi, many thanks for the quick response! No, I did not know of And, yes, I do take your point about |
Our I am not sure I understand the comment on fill value. @larsbarring suggested that the fill value could possibly be used as one marker value. That would be possible with 1e20 as with any other; regardless, that applies to masking, I think. It still is perfectly reasonable to use nans for cases where nans make sense, no? I think a new |
ah then - very good and informative comments from both you gents @larsbarring and @zklaus 👍 I am annoyed with myself I didn't fix the calendar business in regrid_time TBH but then again, that function doesn't really help much when it comes to frequent data (daily etc). Then it does indeed sound like a good idea for a preprocessor! About xarray - good idea, we are directly involved with iris, so prob best we go via iris first (especially since I believe there still is an effort to merge forces, iris and xarray, I mean). Cheers @zklaus - I think I overthought the missing/fill value issue 😁 |
This may also be relevant as an option when adding days to a |
We just encountered problems (again) with In the meantime, I propose to implement a "workaround" for monthly and yearly data. For those, it should be sufficient to simply take the 15th of the month (for monthly data) or 1 July (for yearly data) and assign those dates to the data using a fixed calendar (most likely This would be very simple to implement. Moreover, we are already doing exactly the same for
If we agree on this, I can try to implement that. |
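To make the proposed workaround concrete, here is a minimal sketch. It is only an illustration, not the ESMValCore implementation: the function name `standardize_time_points`, the frequency labels `"mon"` and `"yr"`, and the use of Python's built-in `datetime` as the fixed target calendar are all assumptions made for the example.

```python
from collections import namedtuple
from datetime import datetime


def standardize_time_points(dates, frequency):
    """Map each date to a fixed representative point of its period.

    Monthly data -> the 15th of the month, yearly data -> 1 July.
    Only the `.year` and `.month` attributes of each input date are
    used, so dates that do not exist in the target calendar
    (e.g. 30 February in a 360_day calendar) are not a problem.
    """
    if frequency == "mon":
        return [datetime(d.year, d.month, 15) for d in dates]
    if frequency == "yr":
        return [datetime(d.year, 7, 1) for d in dates]
    raise ValueError(f"unsupported frequency: {frequency!r}")


# Hypothetical stand-in for a date object from a non-standard calendar
ModelDate = namedtuple("ModelDate", "year month day")

model = [ModelDate(2000, 1, 16), ModelDate(2000, 2, 30)]  # 360_day-style dates
obs = [datetime(2000, 1, 1), datetime(2000, 2, 14, 12)]

aligned_model = standardize_time_points(model, "mon")
aligned_obs = standardize_time_points(obs, "mon")
```

After this step both datasets carry identical time points, so they can be compared directly. The data values themselves are untouched.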
How should this be implemented? I can think of two solutions at the moment:
I think 1. is the better solution, and we could even think of changing the default behavior in the future (with a proper deprecation cycle). @ESMValGroup/technical-lead-development-team any opinions? |
In relation to what @schlunma wrote:
I would like to just state the obvious difference between an intensive and an extensive quantity. For the former, simply adjusting the time coordinate should be fine, but for the latter differences in period length between different dataset calendars should be factored in by adjusting the data as such, and not only the time coordinate. |
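As a rough illustration of the extensive case, the sketch below rescales a monthly total by the ratio of month lengths in the two calendars. It assumes the quantity accumulates at a uniform rate within the month, which is a simplification; the helper names `days_in_month` and `rescale_monthly_total` are made up for this example, and only three calendars are handled.

```python
import calendar


def days_in_month(year, month, cal):
    """Length of a month in a few CF calendars (illustrative subset)."""
    if cal == "360_day":
        return 30
    if cal == "365_day":
        # 2001 is a non-leap year, so February always has 28 days here
        return calendar.monthrange(2001, month)[1]
    if cal == "proleptic_gregorian":
        return calendar.monthrange(year, month)[1]
    raise ValueError(f"unsupported calendar: {cal!r}")


def rescale_monthly_total(total, year, month, source_cal, target_cal):
    """Rescale an extensive monthly total to the target month length.

    Assumes a uniform accumulation rate within the month.
    """
    source_days = days_in_month(year, month, source_cal)
    target_days = days_in_month(year, month, target_cal)
    return total * target_days / source_days


# 90 mm accumulated over a 30-day model January, expressed for a 31-day January
january_total = rescale_monthly_total(
    90.0, 2000, 1, "360_day", "proleptic_gregorian"
)
```

An intensive quantity such as a monthly mean temperature would not be rescaled this way; for it, only the time coordinate changes.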
Any improvements to |
In principle, I agree with that. However, I fear that it will take some time to implement this properly for all calendar combinations and frequencies (in addition, there is still no perfect solution for bridging iris and xarray; see SciTools/iris#4994), and I am on a deadline here (I need this for our EGU abstract). Thus, I would propose to expand the existing |
Improving regrid_time a bit sounds fine to me. We can deprecate it as soon as someone has time to actually work on this issue and implement things properly.
This may not be a huge problem in this case: it should be possible to just convert the time coordinate and time-dependent data/coordinates etc. separately to an xarray Dataset and use the resulting values to make a new cube. |
A PR with improvements to |
Is your feature request related to a problem? Please describe.
There are many use-cases when model data and observational datasets are combined for some analyses. When the datasets have daily resolution, non-`standard` model calendars cause problems. In addition, for certain analyses (e.g. of climate indicators related to spell length or day-of-year when something happens) the leap day of the `standard` calendar is a complication. Hence a preprocessor to convert calendars would be very useful, and allow for a common approach within a wide community for solving a problem that otherwise, and traditionally, "everyone" is solving in some ad hoc way by a quick-and-dirty fix (in the worst case again and again).
Hence, we propose the following conversion table (numbers are explained below)
[Conversion table: rows and columns are the calendars `360_day`, `365_day`, `standard`, `gregorian`, `proleptic_gregorian`, `366_day`, `julian` and `none`; the cell entries are missing here.]
`standard` / `gregorian`: the `gregorian` calendar is deprecated
(*) Fill-in: For transformation from the `365_day` calendar to the `standard` or `proleptic_gregorian` calendar it is suggested to add a day after February 28th (day-in-year 59). For transformation from the `360_day` calendar several days have to be added:
For conversion to a non-leap year the following days should be inserted (day-in-year in parentheses):
For conversion to a leap year the following days should be inserted (day-in-year in parentheses):
This follows what has been implemented in xarray (`xarray.Dataset.convert_calendar` using `align_on="year"`). However, as is indicated in the table above, we suggest not implementing transformations to the `360_day` calendar, or the xarray alternative `align_on="date"`, because it removes several days.
We also suggest the following alternatives for "creating" fill-in data:
`NaN`, or similar like `_FillValue`
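For the simplest case described above (`365_day` to a calendar with leap days, filling with `NaN`), the fill-in step could look roughly like the sketch below. It works on a plain list holding one year of daily values and is not tied to iris or xarray; the function name `fill_in_leap_day` is hypothetical, and leap years are decided with the Gregorian rule from Python's `calendar` module.

```python
import calendar
import math


def fill_in_leap_day(values, year, fill_value=math.nan):
    """Convert one year of daily 365_day data to a calendar with leap days.

    In a leap year a fill-in value is inserted after February 28th
    (day-in-year 59); in other years the data are returned unchanged.
    """
    if len(values) != 365:
        raise ValueError("expected 365 daily values")
    values = list(values)  # copy, so the caller's data are not modified
    if calendar.isleap(year):
        # List index 59 is the slot directly after day-in-year 59
        values.insert(59, fill_value)
    return values


daily = [float(day) for day in range(1, 366)]  # value equals day-in-year
leap_year = fill_in_leap_day(daily, 2000)
normal_year = fill_in_leap_day(daily, 2001)
```

Any other marker value (for example a `_FillValue`) can be passed as `fill_value`; an interpolated value would need the neighbouring days and is left out of this sketch.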
Would you be able to help out?
Would you have the time and skills to implement the solution yourself?