Triggered internal error with an expression in j while grouping #3708

MichaelChirico · 2019-07-18T03:07:13Z

Breaking things trying to do coverage and hit an "internal" error:

data.table(e = expression(1), c = '1')[ , e, by = c]

Error in `[.data.table`(data.table(e = expression(1), c = "1"), , e, by = c) : 
  Internal error: column type 'expression' not supported by data.table subset. All known types are supported so please report as bug.

Of course this should stay an error, but handled differently.

The text was updated successfully, but these errors were encountered:

udaydatar7 · 2019-12-25T05:35:01Z

@MichaelChirico what do you mean by handled?

jangorecki · 2019-12-25T07:02:31Z

This should not be an internal error. Should be caught sooner than it currently is. Expressions should be wrapped into list to be stored as a column.

…or unpacked in data.table

tlapak · 2020-03-06T19:53:49Z

I've been poking around a bit and this isn't directly related to the grouping:

data.table(e = expression(1))[1]
# Error in `[.data.table`(data.table(e = expression(1)), 1) : 
#  Internal error: column type 'expression' not supported by data.table subset. All known types are supported so please report as bug.

The printout with a multi-row expression column is also weird:

data.table(e = expression(1, 2), c=c('1', '2'))
#                   e      c
#              <expr> <char>
# 1: expression(1, 2)      1
# 2: expression(1, 2)      2

You get the same issues with columns of type language:

data.table(e=as.call(list(sum, 1)), c='1')[1]
# Error in `[.data.table`(data.table(e = as.call(list(sum, 1)), c = "1"),  : 
#  Internal error: column type 'language' not supported by data.table subset. All known types are supported so please report as bug.

data.table(e=as.call(list(sum, 1)), c='1')
#                       e      c
#                  <call> <char>
# 1: .Primitive("sum")(1)      1
# 2: .Primitive("sum")(1)      1

Which seems extra unintuitive to me.
You also can't print or subset data.tables that have a column of type name:

data.table(e=as.name('a'), c='1')
# Error in format.default(char.trunc(col), justify = justify, ...) : 
#  Found no format() method for class "name" 

data.table(e=as.name('a'), c='1')[1]
# Error in `[.data.table`(data.table(e = as.name("a"), c = "1"), 1) : 
#  invalid type/length (symbol/1) in vector allocation

Type environment can also get funky:

data.table(e=environment())
# Error in dimnames(x) <- dn : 
#  length of 'dimnames' [1] not equal to array extent

I would also generally expect that data.table(something) == data.table(c(something)) == data.table(c(something, something_else))[1] unless explicitly stated otherwise.

From what @jangorecki said, and the work in #4196, it seems to me that the default should be to wrap anything that's not atomic (or a data.frame or matrix) in a list column. Maybe with a warning?
I have a commit here that achieves this: f1ca3f3.

It makes a few tests redundant. It would also make much of the work in #4196 redundant, as far as I can tell. Or are there other ways to get a data.table with non-atomic columns? It would also close #4040 and #546 as won't fix, I guess.

I'll prepare the pull request if people think this is the way to go.

MichaelChirico added beginner-task internals Low labels Jul 18, 2019

tlapak added a commit to tlapak/data.table that referenced this issue Mar 6, 2020

Fix to Rdatatable#3708, all non-atomic types are now wrapped in list …

f1ca3f3

…or unpacked in data.table

jangorecki added non-atomic column e.g. list columns, S4 vector columns and removed beginner-task labels Apr 5, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Triggered internal error with an expression in j while grouping #3708

Triggered internal error with an expression in j while grouping #3708

MichaelChirico commented Jul 18, 2019

udaydatar7 commented Dec 25, 2019

jangorecki commented Dec 25, 2019 •

edited

Loading

tlapak commented Mar 6, 2020

Triggered internal error with an expression in j while grouping #3708

Triggered internal error with an expression in j while grouping #3708

Comments

MichaelChirico commented Jul 18, 2019

udaydatar7 commented Dec 25, 2019

jangorecki commented Dec 25, 2019 • edited Loading

tlapak commented Mar 6, 2020

jangorecki commented Dec 25, 2019 •

edited

Loading