-
Notifications
You must be signed in to change notification settings - Fork 985
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Regression in unique.data.table() as of data.table 1.12.0 #3332
Comments
Thanks for the thorough report! |
Thanks @jameslamb! In 1.11.8, you also get the error if the uniqueness is not resolved before the
In that example I changed v1.12.0+ checks types of the all columns up front before starting the algorithm. Doing this check up front makes things easier internally for parallelism. To make it work and exclude the
I'll add something to NEWS and a suggestion to the error message ... |
@mattdowle ah ok, makes sense! Thanks for taking a look and for clarifying in the error message. |
#
Minimal reproducible example
I believe there's been a regression in
data.table
in version 1.12.0 relative to 1.11.8. TL;DRI think the implementation of
unique.data.table()
has changed and that the new implementation doesn't support complex types like lists in columns.Some users of uptasticsearch have reported receiving an error message from our code that looks like this:
After investigating tonight, I found the source of the issue and can reproduce it. I believe the behavior of
unique()
has changed and I'd consider that change a regression.On 1.12.0:
Raises error
To downgrade to the previous release, I ran the following from the command line:
Once I had v 1.11.8 installed, I re-ran the R code above
Works as expected and returns:
So I went to the blame for unique.data.table() to see what had . changed. Looked like no substantive changes have been made between 1.11.8 and now, so then I thought to look at the blame for
forderv()
.I didn't see anything meaningful in the blame for forderv() either.
I decided to try one more thing...searching for the text "not yet supported" (from the error message). That led me to
forder.c
, whose blame led me to #3124.As far as I can tell, this PR is the source of the problem above. There is no description on the PR so I'm not sure if this is an unintended side effect or a known regression that will be fixed in a future release of
data.table
.#
Output of sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.5.0 tools_3.5.0 yaml_2.2.0 data.table_1.12.0
The text was updated successfully, but these errors were encountered: