Large prediction results unless using repartition(1) in databricks with lgbm model #986

user673 · 2021-02-03T22:17:49Z

I'm using the mmlspark LightGBM model for a regression problem and ran into something strange. With the standard code from the example, the results are terrible because the predictions are huge (around 10^37, while the target ranges from 0 to 200).
While testing, I found that using dataset.repartition(1).cache() fixes the problem, with one catch: training takes much longer (around 1 h, versus 20 min before). This makes sense, since all the data (about 4 million rows and 150 columns) is collected into a single partition before training.

I tried setting the LightGBM param useBarrierExecutionMode to True and tried different parallelism params, but these changes don't affect the result.

Is there a way to avoid the repartition workaround and still get normal results?

Code used for training:

      from mmlspark.lightgbm import LightGBMRegressor
      from pyspark.ml.evaluation import RegressionEvaluator
      from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

      # Workaround: collect everything into one partition (want to delete this line)
      repartitioned_data = data_train.repartition(1).cache()

      # Define model
      model = LightGBMRegressor(
          objective='regression',
          labelCol='label',
          featuresCol='features'
      )

      # Define grid params
      paramGrid = (
          ParamGridBuilder()
          .addGrid(model.numIterations, [100, 250])
          .build()
      )

      # Define cross validation for grid params
      evaluator = RegressionEvaluator(labelCol='label', predictionCol='prediction', metricName='mae')
      crossval = CrossValidator(
          estimator=model,
          estimatorParamMaps=paramGrid,
          evaluator=evaluator,
          numFolds=2
      )

      # Train model (on the repartitioned data while the workaround is in place)
      pipeline = crossval.fit(repartitioned_data)

Databricks Runtime Version 6.4 (includes Apache Spark 2.4.5, Scala 2.11)
3 worker nodes Standard_DS4_v2
driver node Standard_DS4_v2
mmlspark version mmlspark_2.11:1.0.0-rc3

AB#1984587


welcome · 2021-02-03T22:17:50Z

👋 Thanks for opening your first issue here! If you're reporting a 🐞 bug, please make sure you include steps to reproduce it.

AllardJM · 2021-02-08T16:14:36Z

I am seeing something similar: enormous predictions that are sometimes positive and sometimes negative (there are no negative values in the target). With mse as the objective the predictions are all extremely negative, and with a tweedie objective they are all extremely positive. The rank order / discrimination is relatively good, but the estimates are miscalibrated by orders of magnitude.

Using Databricks 7.3 LTS, Spark 3.0.1
Lightgbm: com.microsoft.ml.spark:mmlspark_2.12:1.0.0-rc3-24-495af3e4-SNAPSHOT
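
Editor's note: the symptom described above (predictions many orders of magnitude outside the label range, or non-finite) can be detected with a simple check before any metric is computed. The sketch below is plain Python and only an illustration; the function name, the `tolerance` parameter, and the sample values are hypothetical and not part of mmlspark. In practice the predictions would first be collected or sampled from the Spark DataFrame.

```python
import math


def out_of_range_fraction(predictions, label_min, label_max, tolerance=10.0):
    """Fraction of predictions that are non-finite or far outside the label range.

    A prediction counts as suspicious when it lies more than `tolerance`
    label-range widths beyond either end of [label_min, label_max].
    """
    width = label_max - label_min
    lower = label_min - tolerance * width
    upper = label_max + tolerance * width
    suspicious = sum(
        1 for p in predictions
        if not math.isfinite(p) or p < lower or p > upper
    )
    return suspicious / len(predictions) if predictions else 0.0


# Hypothetical sample: two plausible values and three blown-up ones
preds = [12.5, 180.0, 3.2e37, -4.1e37, float("inf")]
print(out_of_range_fraction(preds, label_min=0.0, label_max=200.0))
```

A non-zero fraction on a target that ranges from 0 to 200 would suggest the model is affected by the problem discussed in this thread rather than being merely inaccurate.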

mhamilton723 · 2021-03-31T17:17:44Z

@AllardJM @user673 Could you possibly share an example dataset for us to repro? Adding @imatiach-msft who built LightGBM on Spark

imatiach-msft · 2021-04-13T09:37:30Z

Interesting. I've seen similar issues reported where setting the tree depth or number of leaves seemed to resolve the large predictions; not sure if it's related to the specific issue(s) @user673 and @AllardJM saw.
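
Editor's note: as a rough illustration of limiting tree depth and number of leaves, the sketch below checks a pair of values for consistency before they are handed to the estimator. It relies on two facts about LightGBM: a binary tree of depth d has at most 2^d leaves, and a `max_depth` of 0 or less means no depth limit. The helper name and the example values (15 leaves, depth 6) are hypothetical and were not suggested in the thread.

```python
def check_tree_limits(num_leaves, max_depth):
    """Validate a (num_leaves, max_depth) pair for leaf-wise tree growth.

    A binary tree of depth d has at most 2**d leaves, so a larger
    num_leaves can never be reached once max_depth is set.
    """
    if num_leaves < 2:
        raise ValueError("num_leaves must be at least 2")
    if max_depth <= 0:
        # LightGBM treats max_depth <= 0 as "no depth limit"
        effective = num_leaves
    else:
        effective = min(num_leaves, 2 ** max_depth)
    return {
        "numLeaves": num_leaves,
        "maxDepth": max_depth,
        "effectiveLeaves": effective,
    }


# Hypothetical values, chosen only for illustration
params = check_tree_limits(num_leaves=15, max_depth=6)
print(params)

# With mmlspark installed and a Spark session available, the two values
# could then be passed to the estimator, for example:
# model = LightGBMRegressor(objective='regression',
#                           numLeaves=params["numLeaves"],
#                           maxDepth=params["maxDepth"])
```

Whether such limits avoid the large predictions in a given case is not established here; the thread only reports that they seemed to help in similar issues.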

imatiach-msft · 2021-04-15T04:24:53Z

@user673 @AllardJM I've found that the last iteration sometimes creates a bad tree that seems to predict inf values. I've created an issue in the lightgbm repo to track this:
microsoft/LightGBM#4178
this seems to happen when I see the following logging:
...
[LightGBM] [Debug] Trained a tree with leaves = 31 and max_depth = 18
[LightGBM] [Debug] Trained a tree with leaves = 31 and max_depth = 16
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 1 and max_depth = 1
[LightGBM] [Warning] Stopped training because there are no more leaves that meet the split requirements

The very last tree, with 1 leaf and depth 1, seems to output the high values; limiting the number of iterations to one less than this seems to prevent the issue.
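
Editor's note: the workaround described above (stop one iteration before the single-leaf tree) can be sketched as a small log scan. This is a minimal illustration, assuming the debug log format shown in the excerpt and a single-output regression model, where each "Trained a tree" line corresponds to one boosting iteration. The function names are hypothetical, and because the excerpt is truncated, the indices below are relative to the sample lines only.

```python
import re

# Matches lines such as:
# [LightGBM] [Debug] Trained a tree with leaves = 31 and max_depth = 18
TREE_LINE = re.compile(r"Trained a tree with leaves = (\d+) and max_depth = (\d+)")


def first_degenerate_iteration(log_lines):
    """Return the 1-based index of the first single-leaf tree, or None."""
    iteration = 0
    for line in log_lines:
        match = TREE_LINE.search(line)
        if match is None:
            continue  # warnings and other log lines do not count as trees
        iteration += 1
        if int(match.group(1)) == 1:
            return iteration
    return None


def safe_num_iterations(log_lines, requested):
    """Cap the iteration count at one less than the first single-leaf tree."""
    bad = first_degenerate_iteration(log_lines)
    if bad is None:
        return requested
    return max(1, min(requested, bad - 1))


sample_log = [
    "[LightGBM] [Debug] Trained a tree with leaves = 31 and max_depth = 18",
    "[LightGBM] [Debug] Trained a tree with leaves = 31 and max_depth = 16",
    "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf",
    "[LightGBM] [Debug] Trained a tree with leaves = 1 and max_depth = 1",
]
print(safe_num_iterations(sample_log, requested=100))
```

The returned value could then be used as the iteration count for a retrain. This only works around the symptom; the underlying fix is tracked in the LightGBM issue linked above.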

imatiach-msft · 2021-04-15T04:26:13Z

this might be different from what @AllardJM is seeing though since I only see inf values, I don't see negative values, and this is just the LightGBMRegressor with regression objective function

imatiach-msft · 2021-04-19T04:25:28Z

@user673 @AllardJM FYI I believe this issue has been fixed with this PR in lightgbm repository:
microsoft/LightGBM#4185
I am waiting for it to be merged to validate/release it

shiyu1994 · 2021-04-22T02:08:03Z

@imatiach-msft The PR has been merged, please check.

imatiach-msft · 2021-04-30T06:17:55Z

Thanks, I've updated the code on latest master and have confirmed the issue is fixed. Since the bug is pretty bad, I will leave this GitHub issue open until the next release so that it's easier to find in case others see it.

shsab · 2022-09-07T18:56:09Z

@imatiach-msft Any update on this issue? We are facing the same issue and using the repartition(1) workaround; however, it is not feasible for large datasets.

adesh14 · 2024-02-20T14:46:51Z

Hey,
I am also encountering the same warning: [LightGBM] [Warning] No further splits with positive gain, best gain: -inf
My dataset contains 360 rows and 19999 columns. Can you please advise what the problem could be? Would setting different hyperparameter values help here?
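
Editor's note: one possible factor for a dataset this small, offered as an assumption rather than a diagnosis, is the minimum leaf size. In LightGBM every leaf must hold at least `min_data_in_leaf` rows (default 20), so the number of leaves a tree can reach is bounded by the row count divided by that value, while the default `num_leaves` is 31. The helper name below is hypothetical; the arithmetic is a sketch of that bound.

```python
def max_leaves_for_rows(n_rows, min_data_in_leaf=20):
    """Upper bound on leaves when every leaf needs min_data_in_leaf rows."""
    return max(1, n_rows // min_data_in_leaf)


n_rows = 360             # row count reported in the comment above
default_num_leaves = 31  # LightGBM default for num_leaves

bound = max_leaves_for_rows(n_rows)
print(bound)
print(bound < default_num_leaves)
```

With 360 rows the bound falls below the default leaf count, so trees may stop splitting early and emit the "No further splits with positive gain" warning. That warning on its own does not necessarily mean the predictions are wrong.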

imatiach-msft mentioned this issue Apr 15, 2021

[mmlspark] how to deal with "Stopped training because there are no more leaves that meet the split requirements" microsoft/LightGBM#4178

Closed

imatiach-msft mentioned this issue Apr 23, 2021

chore: update to lightgbm 3.2.110 #1029

Merged

imatiach-msft mentioned this issue May 3, 2021

Upgrading to 1.0.0-rc2 results in a large drop in classification performance using LightGBMClassifier. #919

Open

ruixinxu added the area/lightgbm label Sep 16, 2022
