Because we aren't perfect, we need an argument in `control_workflow()` to override the options.

None of this should depend on whether the tibble contains sparse vectors, since we will go off the sparsity of the data itself.
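| ID | recipe produce sparsity | sparsity | model support | control arg |
|----|-------------------------|----------|---------------|-------------|
| 1  | yes | high | yes | auto   |
| 2  | yes | high | yes | dense  |
| 3  | yes | high | yes | sparse |
| 4  | yes | high | no  | auto   |
| 5  | yes | high | no  | dense  |
| 6  | yes | high | no  | sparse |
| 7  | yes | low  | yes | auto   |
| 8  | yes | low  | yes | dense  |
| 9  | yes | low  | yes | sparse |
| 10 | yes | low  | no  | auto   |
| 11 | yes | low  | no  | dense  |
| 12 | yes | low  | no  | sparse |
| 13 | no  | high | yes | auto   |
| 14 | no  | high | yes | dense  |
| 15 | no  | high | yes | sparse |
| 16 | no  | high | no  | auto   |
| 17 | no  | high | no  | dense  |
| 18 | no  | high | no  | sparse |
| 19 | no  | low  | yes | auto   |
| 20 | no  | low  | yes | dense  |
| 21 | no  | low  | yes | sparse |
| 22 | no  | low  | no  | auto   |
| 23 | no  | low  | no  | dense  |
| 24 | no  | low  | no  | sparse |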
- `recipe produce sparsity` means that the recipe contains a step with a `sparse` argument.
- `sparsity` is high when there is a lot of sparsity in the data.
- `model support` means that the parsnip model supports sparse data, i.e. `allow_sparse_x = TRUE`.
- `control arg` is what is specified in `control_workflow()`.
I think all of the above combinations should be tested. In general:

- If control is set to `"sparse"`, then the recipe should be updated to set `sparse = "yes"` in steps that have `sparse = "auto"`, and the data should be converted to a `dgCMatrix` before being passed to the engine model.
- If control is set to `"dense"`, then the recipe should be updated to set `sparse = "no"` in steps that have `sparse = "auto"`, and the data should not be converted to a `dgCMatrix` before being passed to the engine model.

What should happen if the control arg is `"auto"` is listed below.
- If the model doesn't support sparsity, then don't give it sparse data, and stop recipes from creating sparsity, regardless of how sparse the data is.
- If sparsity is high and the model supports it, give it sparse data.
- If sparsity is low and the model supports sparse data, don't give it sparse data, and make sure that the recipe doesn't produce sparse data.
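
The rules above can be written down as a small decision function, which may help when turning the 24 combinations into test cases. This is only a language-neutral sketch in Python, not the workflows implementation: `resolve_sparsity()`, `Plan`, and `GRID` are hypothetical names. It assumes that in the `"auto"` case with high sparsity and model support, recipe steps with `sparse = "auto"` are switched to `"yes"`, which the rules do not state explicitly. It also follows the general `"sparse"` rule even when the model lacks sparse support, since the rules do not say what should happen there.

```python
from itertools import product
from typing import NamedTuple


class Plan(NamedTuple):
    # Value substituted into recipe steps that have sparse = "auto".
    recipe_sparse: str
    # Whether the data is converted to a sparse matrix before the engine sees it.
    convert_to_sparse: bool


def resolve_sparsity(control: str, sparsity: str, model_supports_sparse: bool) -> Plan:
    """Hypothetical helper mirroring the rules listed in this issue."""
    if control == "sparse":
        # Forced sparse. Behaviour for models without sparse support is
        # not specified in the issue; this sketch applies the rule as written.
        return Plan("yes", True)
    if control == "dense":
        return Plan("no", False)
    if control != "auto":
        raise ValueError(f"unknown control value: {control!r}")

    # control == "auto": only go sparse when the model supports it AND
    # the data is highly sparse (assumption: the recipe follows suit).
    if model_supports_sparse and sparsity == "high":
        return Plan("yes", True)
    return Plan("no", False)


# The 24 combinations, in the same order as the table (ID 1 first).
# Columns: recipe produce sparsity, sparsity, model support, control arg.
GRID = list(product(["yes", "no"], ["high", "low"], ["yes", "no"],
                    ["auto", "dense", "sparse"]))

if __name__ == "__main__":
    for case_id, (recipe, sparsity, support, control) in enumerate(GRID, start=1):
        plan = resolve_sparsity(control, sparsity, support == "yes")
        print(case_id, recipe, sparsity, support, control, plan)
```

The `recipe produce sparsity` column does not change the outcome in this sketch: if no step has a `sparse` argument, updating the recipe is simply a no-op.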