Date and POSIXct coerced to numeric when calculating median by group #3079
re-run with verbose = TRUE
…On Fri, Sep 28, 2018, 12:48 AM Henrik-P ***@***.***> wrote:
I have data with class Date ('date') and POSIXct ('time'), and a grouping
variable 'g'
d <- data.table(
date = as.Date(c("2018-01-01", "2018-01-03", "2018-01-08",
"2018-01-10", "2018-01-25", "2018-01-30")),
g = rep(letters[1:2], each = 3))
d[ , time := as.POSIXct(date)]
When calculating median of 'date' and 'time' by group, the result is
coerced to numeric:
d[ , median(date), by = g]
# g V1
# 1: a 17534
# 2: b 17556
d[ , median(time), by = g]
# g V1
# 1: a 1514937600
# 2: b 1516838400
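For reference, the numbers in V1 look like the underlying epoch values with the class attribute dropped: days since 1970-01-01 for Date, and seconds since 1970-01-01 00:00:00 UTC for POSIXct. A minimal base R sketch that decodes them by hand (the variable names are illustrative, not from the report):

```r
# Values copied from the grouped output above
date_num <- c(17534, 17556)
time_num <- c(1514937600, 1516838400)

# Date is stored as days since 1970-01-01
decoded_date <- as.Date(date_num, origin = "1970-01-01")

# POSIXct is stored as seconds since 1970-01-01 00:00:00 UTC
decoded_time <- as.POSIXct(time_num, origin = "1970-01-01", tz = "UTC")

format(decoded_date)                       # "2018-01-03" "2018-01-25"
format(decoded_time, "%Y-%m-%d %H:%M:%S")  # "2018-01-03 00:00:00" "2018-01-25 00:00:00"
```

So the values themselves are the correct group medians; only the class is lost.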
------------------------------
However, 'date' and 'time' are *not* coerced when calculating median
*without* grouping:
d[ , median(date)]
# [1] "2018-01-09"
d[ , median(time)]
# [1] "2018-01-09 01:00:00 CET"
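The 01:00:00 in the 'time' result is only a display effect: as.POSIXct() on a Date gives midnight UTC, which prints one hour later in a CET session. A small base R sketch, where "Europe/Berlin" is an assumed stand-in for the reporter's time zone:

```r
# Midnight UTC on the overall median date
x <- as.POSIXct(as.Date("2018-01-09"))

# Same instant, formatted in two time zones
utc_str <- format(x, "%Y-%m-%d %H:%M:%S", tz = "UTC")
cet_str <- format(x, "%Y-%m-%d %H:%M:%S", tz = "Europe/Berlin")  # assumed zone

utc_str  # "2018-01-09 00:00:00"
cet_str  # "2018-01-09 01:00:00"
```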
------------------------------
Other things I've tried which don't coerce:
Mean 'date' and 'time' by group:
d[ , mean(date), by = g]
# g V1
# 1: a 2018-01-04
# 2: b 2018-01-21
d[ , mean(time), by = g]
# g V1
# 1: a 2018-01-04 01:00:00
# 2: b 2018-01-21 17:00:00
Median 'date' and 'time' by group using aggregate:
aggregate(date ~ g, data = d, median)
# g date
# 1 a 2018-01-03
# 2 b 2018-01-25
aggregate(time ~ g, data = d, median)
# g time
# 1 a 2018-01-03 01:00:00
# 2 b 2018-01-25 01:00:00
------------------------------
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server >= 2012 x64 (build 9200)
data.table_1.11.6
Thanks Michael. Here we go:
Detected that j uses these columns: date
Detected that j uses these columns: time
Thanks... I suspected For now, if you're just interested in moving on, you can temporarily disable
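Separately from any settings change, one way to get classed results is to take the median of the underlying numbers and restore the class explicitly, which does not depend on which code path data.table takes. A sketch using the data from the report; the result column names med_date and med_time and the choice of tz = "UTC" are illustrative assumptions:

```r
library(data.table)

# Data from the original report
d <- data.table(
  date = as.Date(c("2018-01-01", "2018-01-03", "2018-01-08",
                   "2018-01-10", "2018-01-25", "2018-01-30")),
  g = rep(letters[1:2], each = 3)
)
d[, time := as.POSIXct(date)]

# Median of the raw numbers per group, then restore the class by hand
res <- d[, .(
  med_date = as.Date(median(as.numeric(date)), origin = "1970-01-01"),
  med_time = as.POSIXct(median(as.numeric(time)),
                        origin = "1970-01-01", tz = "UTC")
), by = g]

res
```

Note that med_time prints in UTC here; pick the tz that matches your data.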
Thanks a lot for your rapid response. I think this may be a regression - as far as I recall, this used to work in earlier versions.
It certainly sounds familiar... there may be an outstanding issue...
Very similar to #1876
Another bug, maybe related -- gmedian is coercing integers to reals, in contrast with base
I guess one could argue that the base median behavior is wrong (since the return type is unpredictable).
I'm very rarely against consistency with base R, but in this particular case of gmedian I'd prefer to always have a double returned.
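To illustrate the "unpredictable" return type mentioned above: with integer input, base median() returns an integer for odd-length vectors (it picks the middle element) and a double for even-length ones (it averages the two middle elements). A small base R sketch, with illustrative variable names:

```r
# Odd length: the middle element is returned as-is, so the type stays integer
odd_type <- typeof(median(c(1L, 2L, 3L)))

# Even length: the two middle elements are averaged, so the type becomes double
even_type <- typeof(median(c(1L, 2L, 3L, 4L)))

# Double even when the average happens to be a whole number
whole_type <- typeof(median(c(1L, 3L)))

odd_type    # "integer"
even_type   # "double"
whole_type  # "double"
```

Always returning a double, as preferred above for gmedian, avoids a result type that depends on group size.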