Merging two data.tables where one (or both) has a keyed column containing only NA_character_ values produces a segfault and crashes the R session.
With valgrind enabled, using the current data.table development version (1.14.1), the above code returns:
==10795== Use of uninitialised value of size 8
==10795== at 0x4FB7910: LEVELS (in /usr/lib/R/lib/libR.so)
==10795== by 0x101824AA: issorted (in /home/jchau/R/x86_64-pc-linux-gnu-library/4.0/data.table/libs/datatable.so)
==10795== by 0x4F352AB: ??? (in /usr/lib/R/lib/libR.so)
==10795== by 0x4F7540B: ??? (in /usr/lib/R/lib/libR.so)
==10795== by 0x4F7F66F: Rf_eval (in /usr/lib/R/lib/libR.so)
==10795== by 0x4F8148E: ??? (in /usr/lib/R/lib/libR.so)
==10795== by 0x4F82256: Rf_applyClosure (in /usr/lib/R/lib/libR.so)
==10795== by 0x4F76908: ??? (in /usr/lib/R/lib/libR.so)
==10795== by 0x4F7F66F: Rf_eval (in /usr/lib/R/lib/libR.so)
==10795== by 0x4F8148E: ??? (in /usr/lib/R/lib/libR.so)
==10795== by 0x4F82256: Rf_applyClosure (in /usr/lib/R/lib/libR.so)
==10795== by 0x4FC5362: ??? (in /usr/lib/R/lib/libR.so)
==10795== Uninitialised value was created by a heap allocation
==10795== at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==10795== by 0x4FBE353: ??? (in /usr/lib/R/lib/libR.so)
==10795== by 0x4FBFE81: Rf_allocVector3 (in /usr/lib/R/lib/libR.so)
==10795== by 0x4F892F7: R_bcEncode (in /usr/lib/R/lib/libR.so)
==10795== by 0x50215C6: ??? (in /usr/lib/R/lib/libR.so)
==10795== by 0x502163F: ??? (in /usr/lib/R/lib/libR.so)
==10795== by 0x502095C: ??? (in /usr/lib/R/lib/libR.so)
==10795== by 0x501FCA9: ??? (in /usr/lib/R/lib/libR.so)
==10795== by 0x5021A2D: R_Unserialize (in /usr/lib/R/lib/libR.so)
==10795== by 0x5022DC9: ??? (in /usr/lib/R/lib/libR.so)
==10795== by 0x5023200: ??? (in /usr/lib/R/lib/libR.so)
==10795== by 0x4F7FBF5: Rf_eval (in /usr/lib/R/lib/libR.so)
==10795==
==10795== Invalid read of size 2
==10795== at 0x4FB7910: LEVELS (in /usr/lib/R/lib/libR.so)
==10795== by 0x101824AA: issorted (in /home/jchau/R/x86_64-pc-linux-gnu-library/4.0/data.table/libs/datatable.so)
==10795== by 0x4F352AB: ??? (in /usr/lib/R/lib/libR.so)
==10795== by 0x4F7540B: ??? (in /usr/lib/R/lib/libR.so)
==10795== by 0x4F7F66F: Rf_eval (in /usr/lib/R/lib/libR.so)
==10795== by 0x4F8148E: ??? (in /usr/lib/R/lib/libR.so)
==10795== by 0x4F82256: Rf_applyClosure (in /usr/lib/R/lib/libR.so)
==10795== by 0x4F76908: ??? (in /usr/lib/R/lib/libR.so)
==10795== by 0x4F7F66F: Rf_eval (in /usr/lib/R/lib/libR.so)
==10795== by 0x4F8148E: ??? (in /usr/lib/R/lib/libR.so)
==10795== by 0x4F82256: Rf_applyClosure (in /usr/lib/R/lib/libR.so)
==10795== by 0x4FC5362: ??? (in /usr/lib/R/lib/libR.so)
==10795== Address 0x1000000010001 is not stack'd, malloc'd or (recently) free'd
==10795==
*** caught segfault ***
address (nil), cause 'unknown'
Traceback:
1: is.sorted(jval, by = key(x))
2: `[.data.table`(dt3, , .(x1, x2))
3: dt3[, .(x1, x2)]
Session info
R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.14.1
loaded via a namespace (and not attached):
[1] compiler_4.0.2
Note
Using an explicit merge does work as expected:
library(data.table)
dt1 <- data.table(x1 = rep(letters[1:4], each = 3), x2 = NA_character_)
dt2 <- data.table(x1 = letters[1:3])
setkey(dt1, x2)
dt3 <- merge(dt1, dt2, by = "x1")
dt3[, .(x1, x2)]
#>    x1   x2
#> 1:  a <NA>
#> 2:  a <NA>
#> 3:  a <NA>
#> 4:  b <NA>
#> 5:  b <NA>
#> 6:  b <NA>
#> 7:  c <NA>
#> 8:  c <NA>
#> 9:  c <NA>
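Until this is fixed, it may help to flag tables that match the condition described above before joining them. The following is only a minimal sketch: `all_na_key_cols` is a hypothetical helper, not part of data.table, and it assumes the crash is tied to keyed character columns that are entirely NA, which is what this report observes but has not been confirmed as the root cause.

```r
library(data.table)

# Hypothetical helper (not a data.table function): returns the names of key
# columns that are of type character and contain only NA values.
all_na_key_cols <- function(dt) {
  k <- key(dt)
  # An unkeyed table has nothing to flag.
  if (is.null(k)) return(character(0))
  flagged <- vapply(
    k,
    function(col) is.character(dt[[col]]) && all(is.na(dt[[col]])),
    logical(1)
  )
  k[flagged]
}

# Same tables as in the note above.
dt1 <- data.table(x1 = rep(letters[1:4], each = 3), x2 = NA_character_)
dt2 <- data.table(x1 = letters[1:3])
setkey(dt1, x2)

all_na_key_cols(dt1)
all_na_key_cols(dt2)
```

For `dt1` the helper returns `"x2"`, and for the unkeyed `dt2` it returns an empty character vector. It only inspects the key attribute and the column contents, so it does not go through the `is.sorted` call shown in the traceback.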
JorisChau changed the title from "Segfault merging data.tables with keyed NA columns" to "Segfault merging data.tables with keyed NA_character_ columns" on Jul 9, 2021