Date rounding with daylight saving time #640

Closed
tomwwagstaff opened this issue Feb 20, 2018 · 12 comments

Comments

@tomwwagstaff

tomwwagstaff commented Feb 20, 2018

y <- floor_date(x, unit = "month")

Returns values for y at 00:00 during GMT, but 01:00 during BST.

This poster on Stack Overflow seems to be experiencing similar issues with ceiling_date: https://stackoverflow.com/questions/48125040/lubridate-ceiling-date-bug-with-daylight-savings.

My workaround has been to convert all datetimes to GMT, but I think this means that any payments between 00:00 and 01:00 during Summer will be allocated to the wrong date.

Another workaround was this:
y <- floor_date(x, unit = "month") %>% floor_date(unit = "day")
but I'm worried it'll suffer from the same problem.
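For comparison, here is a minimal sketch of the behaviour one would expect when the time zone is set explicitly on the input. The sample timestamps are made up for illustration, and it assumes a lubridate release in which the time zone caching problem discussed later in this thread is fixed:

```r
library(lubridate)

# Hypothetical sample data: one winter (GMT) and one summer (BST) timestamp,
# with the time zone stated explicitly instead of relying on the session default.
x <- ymd_hms(c("2017-01-15 10:30:00", "2017-08-13 10:30:00"),
             tz = "Europe/London")

# Rounding is done on the local clock, so both results should be local midnight
# on the first day of the month.
y <- floor_date(x, unit = "month")

hour(y)
day(y)
```

If `hour(y)` is not 0 for the summer value, checking `attr(x, "tzone")` on the input is a reasonable first step.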

@vspinu
Member

vspinu commented Feb 20, 2018

Reproducible example? Version of R, lubridate, OS?

I cannot reproduce the SO issues. Very likely the problem is no longer there in the newer versions of lubridate:

> dt_1 <- lubridate::ymd("2017-10-01", tz = "Australia/Adelaide") %>%
+     magrittr::add(lubridate::hours(c(0,1,23,24)))
> 
> dt_2 <- lubridate::ymd("2017-04-02", tz = "Australia/Adelaide") %>%
+     magrittr::add(lubridate::hours(c(0,1,23,24)))
> 
> lubridate::ceiling_date(dt_1, unit = "days")
[1] "2017-10-01 ACST" "2017-10-02 ACDT" "2017-10-02 ACDT" "2017-10-02 ACDT"
> lubridate::ceiling_date(dt_2, unit = "days")
[1] "2017-04-02 ACDT" "2017-04-03 ACST" "2017-04-03 ACST" "2017-04-03 ACST"

@tomwwagstaff
Author

Well that SO post is only 1 month old, and I'm using v1.7.2 on R 3.4.2 on Windows 10.

I cannot reproduce the SO problem either, nor can I reproduce my own problem on a contrived dataset. But let me illustrate what's happening with force_tz...

Creating a new date and then changing the timezone works as expected:

example <- as.POSIXct("2017-08-13", "Europe/London")
example
[1] "2017-08-13 BST"
example %>% force_tz(tzone = "Etc/GMT")
[1] "2017-08-13 GMT"
example %>% with_tz(tzone = "Etc/GMT")
[1] "2017-08-12 23:00:00 GMT"

However, performing the same operations on the same value in my dataset, force_tz fails:

dateRange[1]
[1] "2017-08-13 BST"
dateRange[1] %>% force_tz(tzone = "Etc/GMT")
[1] "2017-08-12 23:00:00 GMT"
dateRange[1] %>% with_tz(tzone = "Etc/GMT")
[1] "2017-08-12 23:00:00 GMT"

And these values are identical:

example == dateRange[1]
[1] TRUE
example %>% force_tz(tzone = "Etc/GMT") == dateRange[1] %>% force_tz(tzone = "Etc/GMT")
[1] FALSE

Crazy, no?
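As a reference point for what the two functions are meant to do, the sketch below compares the underlying instants rather than the printed values. It assumes the `tzone` attribute is set explicitly on the input; the variable names are illustrative only:

```r
library(lubridate)

x <- as.POSIXct("2017-08-13 00:00:00", tz = "Europe/London")

# force_tz() keeps the clock reading and changes the instant.
forced <- force_tz(x, tzone = "UTC")

# with_tz() keeps the instant and changes the clock reading.
shifted <- with_tz(x, tzone = "UTC")

# Differences in seconds relative to the original instant.
as.numeric(forced) - as.numeric(x)   # 3600: midnight UTC is one hour after midnight BST
as.numeric(shifted) - as.numeric(x)  # 0: same instant, only displayed differently
```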

@tomwwagstaff
Author

Okay, I've fixed the problem. Here's the solution:

attr(dateRange, "tzone") <- "Europe/London"

Now dateRange[1] behaves the same as example. The timezone for dateRange[1] was "", despite showing up as BST when printing the value.

Maybe this isn't a bug as such, but it is unexpected behaviour that can wrong-foot new users. Is it worth adding a note in the documentation about explicitly setting time zones, even when they're apparently already set?

(And of course this doesn't answer the original SO question, where time zones were explicitly set by the user.)
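A short base R sketch of how to inspect the stored `tzone` attribute, as opposed to the abbreviation shown when printing. The variable names are illustrative, and the empty-string result is what base R typically stores when no `tz` is given:

```r
# A value created without tz prints with the session's zone abbreviation,
# but the stored attribute is typically the empty string.
z <- as.POSIXct("2017-08-13")
attr(z, "tzone")

# A value created with an explicit tz stores the zone name.
x <- as.POSIXct("2017-08-13 00:00:00", tz = "Europe/London")
attr(x, "tzone")

# Changing the attribute changes how the value is displayed, not the instant.
y <- x
attr(y, "tzone") <- "UTC"
as.numeric(x) == as.numeric(y)
format(y, "%Y-%m-%d %H:%M:%S")   # "2017-08-12 23:00:00"
```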

@vspinu
Member

vspinu commented Feb 21, 2018

I see. Could you please post the output of the following:

Sys.timezone()
Sys.getenv("TZ")
lubridate:::C_local_tz()
as.POSIXct("2017-08-13", tz = "")
ymd("2017-08-13", tz = "")
force_tz(as.POSIXct("2017-08-13", tz = ""), "UTC")

I bet this one comes from the discrepancy between Sys.timezone and the time zone inferred by R when tz="". If so, it's a version of #619. I thought this was an issue with R-devel only, but it looks like it goes back to at least R 3.4.2.

In a nutshell, there is no way to determine the current time zone (that is, the time zone R uses when tz="") either from R itself or from C code. I pointed this out to the R folks, but no one seems to give a damn. So from the next lubridate version I will try to remove all dependency on as.POSIXlt, just to avoid dealing with this issue.

@tomwwagstaff
Author

Certainly, here you go:

> Sys.timezone()
[1] "Europe/London"
> Sys.getenv("TZ")
[1] ""
> lubridate:::C_local_tz()
[1] "Europe/London"
> as.POSIXct("2017-08-13", tz = "")
[1] "2017-08-13 BST"
> ymd("2017-08-13", tz = "")
[1] "2017-08-13 BST"
> force_tz(as.POSIXct("2017-08-13", tz = ""), "UTC")
[1] "2017-08-13 UTC"

I wasn't expecting the last line to work, so I'm now thoroughly confused...

@vspinu
Member

vspinu commented Feb 22, 2018

This is strange indeed. The 1st and 3rd outputs indicate that lubridate's internal time zone and R's time zone match, so there should be no problem.

Could it be the funky "Etc/GMT" time zone string which you use? I would really need a minimal example here to understand what's going on.

@vspinu
Member

vspinu commented Feb 22, 2018

The underlying cause is probably the same as in #642 and #643.

@vspinu
Member

vspinu commented Feb 22, 2018

Could you please check the master on your vector? Thanks!

@tomwwagstaff
Author

Hey there, sorry - I haven't checked back on GitHub all week - apologies for the delay.

And, second apology, I don't understand what you mean by the master on my vector?

@vspinu
Member

vspinu commented Feb 28, 2018

I meant: check the GitHub master, installed with devtools::install_github("tidyverse/lubridate").

@leobarlach

leobarlach commented Mar 20, 2018

Hi, I'm having a similar problem, with DST not being consistently applied.

An example:

Time <- c(as.POSIXct('2017-11-05 01:00:00',tz = 'US/CENTRAL'),
       as.POSIXct('2017-11-05 01:15:00',tz = 'US/CENTRAL'),
       as.POSIXct('2017-11-05 01:30:00',tz = 'US/CENTRAL'),
       as.POSIXct('2017-11-05 01:45:00',tz = 'US/CENTRAL'),
       as.POSIXct('2017-11-05 01:00:00',tz = 'US/CENTRAL'),
       as.POSIXct('2017-11-05 01:15:00',tz = 'US/CENTRAL'),
       as.POSIXct('2017-11-05 01:30:00',tz = 'US/CENTRAL'),
       as.POSIXct('2017-11-05 01:45:00',tz = 'US/CENTRAL'))

library(data.table)
repeatedHourFlag <- c(FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE)
timeSeries <- data.table(Time, repeatedHourFlag)

timeSeries[, Time := lubridate::with_tz(Time,tzone = 'UTC')]
timeSeries

The data I receive comes with local time stamps, plus a column flagging the repeated hour in case of DST. Ideally, I would convert to UTC and add one hour for the repeated hours. However, 1:00 through 1:30 convert correctly, while 1:45 converts with one hour more.

@vspinu
Member

vspinu commented Mar 22, 2018

By the nature of the problem, one cannot know whether the hour is repeated without an extra flag. You seem to have such a flag, so you could simply add an hour yourself:

timeSeries[repeatedHourFlag == T, Time := Time + hours(1)]
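An alternative sketch in base R, which is not the approach suggested above: it sidesteps ambiguous local times by parsing the wall-clock readings as if they were UTC and then applying the UTC offset chosen by the flag. The offsets (5 hours for CDT, 6 hours for CST) are assumptions that hold only for the US Central fall-back hour on 2017-11-05; the sample values and variable names are illustrative:

```r
# Wall-clock readings parsed as UTC so that no DST rule is applied while parsing.
wall <- as.POSIXct(c("2017-11-05 01:00:00", "2017-11-05 01:45:00",
                     "2017-11-05 01:00:00", "2017-11-05 01:45:00"),
                   tz = "UTC")
repeated <- c(FALSE, FALSE, TRUE, TRUE)

# Assumed offsets: first pass through 01:xx is CDT (UTC-5),
# the repeated pass is CST (UTC-6).
offset_hours <- ifelse(repeated, 6, 5)

# Shift each reading by its offset to get the true UTC instant.
utc <- wall + offset_hours * 3600

format(utc, "%H:%M")   # "06:00" "06:45" "07:00" "07:45"
```

For data covering more than the ambiguous hour, the offset would have to come from the time zone rules instead of a hard-coded pair.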

BTW, on my system US/CENTRAL is not a valid time zone. Results are unpredictable with base R and you don't get a warning. With lubridate's functions you will get an error:

 > time <- ymd_hms(c('2017-11-05 01:00:00',
+                   '2017-11-05 01:15:00',
+                   '2017-11-05 01:30:00',
+                   '2017-11-05 01:45:00',
+                   '2017-11-05 01:00:00',
+                   '2017-11-05 01:15:00',
+                   '2017-11-05 01:30:00',
+                   '2017-11-05 01:45:00'),
+                 tz = "US/CENTRAL")
Error in C_force_tz(time, tz = tzone, roll) : 
  CCTZ: Unrecognized output timezone: "US/CENTRAL"

> ymd_hms(c('2017-11-05 01:00:00',
+           '2017-11-05 01:15:00',
+           '2017-11-05 01:30:00',
+           '2017-11-05 01:45:00',
+           '2017-11-05 01:00:00',
+           '2017-11-05 01:15:00',
+           '2017-11-05 01:30:00',
+           '2017-11-05 01:45:00'),
+         tz = "America/New_York")
[1] "2017-11-05 01:00:00 EST" "2017-11-05 01:15:00 EST"
[3] "2017-11-05 01:30:00 EST" "2017-11-05 01:45:00 EST"
[5] "2017-11-05 01:00:00 EST" "2017-11-05 01:15:00 EST"
[7] "2017-11-05 01:30:00 EST" "2017-11-05 01:45:00 EST"
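Because base R may silently accept an unknown zone name, a small guard before parsing can help. The helper below is hypothetical (not part of lubridate) and relies on base R's `OlsonNames()`, whose contents depend on the time zone database installed on the system:

```r
# Hypothetical helper: TRUE when the name appears in the system's tz database.
# The comparison is case-sensitive.
is_valid_tz <- function(tz) {
  tz %in% OlsonNames()
}

is_valid_tz("Europe/London")
is_valid_tz("Not/AZone")
is_valid_tz("US/CENTRAL")   # result depends on the system's tz database
```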

@vspinu vspinu closed this as completed Apr 10, 2018
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue May 30, 2021
Version 1.7.10
==============

### NEW FEATURES

* `fast_strptime()` and `parse_date_time2()` now accept multiple formats and apply them in turn

### BUG FIXES

* [#926](tidyverse/lubridate#926) Fix incorrect division of intervals by months involving leap years
* Fix incorrect skipping of digits during parsing of the `%z` format

Version 1.7.9.2
===============

### NEW FEATURES

* [#914](tidyverse/lubridate#914) New `rollforward()` function
* [#928](tidyverse/lubridate#928) On startup lubridate now resets TZDIR to a proper directory when it is set to non-dir values like "internal" or "macOS" (a change introduced in R4.0.2)
* [#630](tidyverse/lubridate#630) New parsing functions `ym()` and `my()`

### BUG FIXES

* [#930](tidyverse/lubridate#930) `as.period()` on intervals now returns valid Periods with double fields (not integers)



Version 1.7.9
=============

### NEW FEATURES

* [#871](tidyverse/lubridate#893) Add `vctrs` support


### BUG FIXES

* [#890](tidyverse/lubridate#890) Correctly compute year in `quarter(..., with_year = TRUE)`
* [#893](tidyverse/lubridate#893) Fix incorrect parsing of abbreviated months in locales with trailing dot (regression in v1.7.8)
* [#886](tidyverse/lubridate#886) Fix `with_tz()` for POSIXlt objects
* [#887](tidyverse/lubridate#887) Error on invalid numeric input to `month()`
* [#889](tidyverse/lubridate#889) Export new dmonth function

Version 1.7.8
=============

### NEW FEATURES

* (breaking) Year and month durations now assume 365.25 days in a year consistently in conversion and constructors. Particularly `dyears(1) == years(1)` is now `TRUE`.
* Format and print methods for 0-length objects are more consistent.
* New duration constructor `dmonths()` to complement other duration constructors.
* `duration()` constructor now accepts `months` and `years` arguments.
* [#629](tidyverse/lubridate#629) Added `format_ISO8601()` methods.
* [#672](tidyverse/lubridate#672) Eliminate all partial argument matches
* [#674](tidyverse/lubridate#674) `as_date()` now ignores the `tz` argument
* [#675](tidyverse/lubridate#675) `force_tz()`, `with_tz()`, `tz<-` convert dates to date-times
* [#681](tidyverse/lubridate#681) New constants `NA_Date_` and `NA_POSIXct_` which parallel built-in primitive constants.
* [#681](tidyverse/lubridate#681) New constructors `Date()` and `POSIXct()` which parallel built-in primitive constructors.
* [#695](tidyverse/lubridate#695) Durations can now be compared with numeric vectors.
* [#707](tidyverse/lubridate#707) Constructors return 0-length inputs when called with no arguments
* [#713](tidyverse/lubridate#713) (breaking) `as_datetime()` always returns a `POSIXct()`
* [#717](tidyverse/lubridate#717) Common generics are now defined in `generics` dependency package.
* [#719](tidyverse/lubridate#719) Negative Durations are now displayed with leading `-`.
* [#829](tidyverse/lubridate#829) `%within%` throws more meaningful messages when applied on unsupported classes
* [#831](tidyverse/lubridate#831) Changing hour, minute or second of Date object now yields POSIXct.
* [#869](tidyverse/lubridate#869) Propagate NAs to all internal components of a Period object

### BUG FIXES

* [#682](tidyverse/lubridate#682) Fix quarter extraction with small `fiscal_start`s.
* [#703](tidyverse/lubridate#703) `leap_year()` works with objects supported by `year()`.
* [#778](tidyverse/lubridate#778) `duration()/period()/make_difftime()` work with repeated units
* `c.Period` concatenation doesn't fail with empty components.
* Honor `exact = TRUE` argument in `parse_date_time2`, which was so far ignored.

Version 1.7.4
=============

### NEW FEATURES

* [#658](tidyverse/lubridate#658) `%within%` now accepts a list of intervals, in which case an instant is checked if it occurs within any of the supplied intervals.

### CHANGES

* [#661](tidyverse/lubridate#661) Throw error on invalid multi-unit rounding.
* [#633](tidyverse/lubridate#633) `%%` on intervals relies on `%m+` arithmetic and doesn't produce NAs when intermediate computations result in non-existent dates.
* `tz()` always returns "UTC" when `tzone` attribute cannot be inferred.

### BUG FIXES

* [#664](tidyverse/lubridate#664) Fix lookup of period functions in `as.period`
* [#649](tidyverse/lubridate#664) Fix system timezone memoization

Version 1.7.3
=============

### BUG FIXES

* [#643](tidyverse/lubridate#643), [#640](tidyverse/lubridate#640), [#645](tidyverse/lubridate#645) Fix faulty caching of system timezone.