lightgbm model doesn't see sample_size parameter from boost_tree #30

Closed
bwilkowski opened this issue Jul 4, 2022 · 4 comments · Fixed by tidymodels/parsnip#768 or #32

Comments

@bwilkowski

While testing the bonsai package, I noticed some odd behaviour regarding tuning parameters. When I specified which parameters should be tuned via boost_tree, I saw that lightgbm did not use sample_size. Is that the correct behaviour?
I used the following code to configure the model.

library(tidymodels)
library(bonsai)

# Preprocessing: mark identifier columns, drop zero-variance and highly correlated predictors
pre_proc <-
  recipe(status ~ ., data = data_train) %>%
  update_role(cif, data_alertu, okres_alertu, new_role = "id variable") %>%
  step_zv(all_predictors()) %>%
  step_corr(all_predictors(), threshold = .9)

# Model specification: seven arguments marked for tuning
gbm_mod <-
  boost_tree(
    mode = "classification",
    trees = tune(),
    tree_depth = tune(),
    min_n = tune(),
    loss_reduction = tune(),
    sample_size = tune(),
    mtry = tune(),
    learn_rate = tune()
  ) %>%
  set_engine("lightgbm")

gbm_wflow <-
  workflow() %>%
  add_model(gbm_mod) %>%
  add_recipe(pre_proc)

# Extract the tunable parameters and give mtry a concrete range
gbm_set <-
  extract_parameter_set_dials(gbm_wflow) %>%
  update(mtry = mtry(c(10, 50)))

gbm_set returned only 6 parameters, although I set 7 to be tuned. I saw that sample_size is missing. Shouldn't it be translated to lightgbm's bagging_fraction parameter?
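
A smaller check along the same lines, as a sketch only: it assumes nothing beyond tidymodels and bonsai being installed, and the object names `spec` and `params` are made up for illustration. It lists which tune() placeholders are detected on a lightgbm specification. Whether sample_size shows up will depend on the installed parsnip and bonsai versions, so no expected output is given.

```r
library(tidymodels)
library(bonsai)

# Hypothetical minimal spec: two arguments marked for tuning
spec <-
  boost_tree(trees = tune(), sample_size = tune()) %>%
  set_engine("lightgbm") %>%
  set_mode("classification")

# Collect the parameters that dials can see on this spec
params <- extract_parameter_set_dials(spec)

# ids of the detected tuning parameters
print(params$id)

# TRUE if sample_size was detected, FALSE if it was dropped
print("sample_size" %in% params$id)
```

If the last line prints FALSE, the specification is affected by the behaviour described above.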

@simonpcouch

This comment was marked as off-topic.

@bwilkowski

This comment was marked as off-topic.

@simonpcouch
Contributor

simonpcouch commented Jul 6, 2022

Ah, my apologies! This is indeed an issue.

A quick reprex to jump off from when I have a moment for this:

library(tidymodels)
library(bonsai)
data(penguins, package = "modeldata")

bt <-
  boost_tree(sample_size = tune()) %>%
  set_engine(engine = "lightgbm") %>%
  set_mode(mode = "classification")

grid <-
  tune_grid(
    bt,
    species ~ flipper_length_mm + island,
    bootstraps(penguins)
  )
#> Warning: No tuning parameters have been detected, performance will be evaluated
#> using the resamples with no tuning. Did you want to [tune()] parameters?

Created on 2022-07-06 by the reprex package (v2.0.1)

The usage of sample_prop is only internal and need not be user-facing; it just helps us come up with reasonable values to tune over. It looks like parsnip has a way of handling this, so I'll need to either open a PR over there or emulate it in bonsai.
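
To illustrate what "reasonable values to tune over" can look like, here is a minimal sketch that uses dials directly. sample_prop() is a dials parameter for a proportion of observations, so it comes with a finite default range, which makes it convenient for building grids. The name `prop_grid` and the choice of 4 levels are arbitrary, and this is not meant to show how bonsai wires the parameter internally.

```r
library(dials)

# Parameter object for a proportion of rows; prints its default range
prop_param <- sample_prop()
print(prop_param)

# A small regular grid of candidate proportions (4 levels is an arbitrary choice)
prop_grid <- grid_regular(prop_param, levels = 4)
print(prop_grid)
```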

I'm focused elsewhere at the moment, but a fix for this will definitely be included in the next release of the package. :)

Thanks again for the report.

@github-actions

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Jan 11, 2023