
Check failed: gpu_predictor_ #5592

Closed
rgbinventions opened this issue Apr 24, 2020 · 10 comments

@rgbinventions

rgbinventions commented Apr 24, 2020

Hey guys!
I've built xgboost with GPU support and everything was OK. But when I train the classifier model, I get several gpu_predictor warnings and then a fatal error. Here is the full traceback:

```
XGBoostError                              Traceback (most recent call last)
<ipython-input-5-f9ae4a94f7eb> in <module>
     28 
     29 gsearch = GridSearchCV(estimator=model, param_grid=params_grid, scoring='roc_auc', cv=5)
---> 30 gsearch.fit(X_train, y_train)
     31 
     32 predictions = gsearch.best_estimator_.predict(X_test)

C:\Anaconda3\lib\site-packages\sklearn\model_selection\_search.py in fit(self, X, y, groups, **fit_params)
    737             refit_start_time = time.time()
    738             if y is not None:
--> 739                 self.best_estimator_.fit(X, y, **fit_params)
    740             else:
    741                 self.best_estimator_.fit(X, **fit_params)

C:\Anaconda3\lib\site-packages\xgboost\sklearn.py in fit(self, X, y, sample_weight, base_margin, eval_set, eval_metric, early_stopping_rounds, verbose, xgb_model, sample_weight_eval_set, callbacks)
    822                               evals_result=evals_result, obj=obj, feval=feval,
    823                               verbose_eval=verbose, xgb_model=xgb_model,
--> 824                               callbacks=callbacks)
    825 
    826         self.objective = xgb_options["objective"]

C:\Anaconda3\lib\site-packages\xgboost\training.py in train(params, dtrain, num_boost_round, evals, obj, feval, maximize, early_stopping_rounds, evals_result, verbose_eval, xgb_model, callbacks)
    210                            evals=evals,
    211                            obj=obj, feval=feval,
--> 212                            xgb_model=xgb_model, callbacks=callbacks)
    213 
    214 

C:\Anaconda3\lib\site-packages\xgboost\training.py in _train_internal(params, dtrain, num_boost_round, evals, obj, feval, xgb_model, callbacks)
     73         # Skip the first update if it is a recovery step.
     74         if version % 2 == 0:
---> 75             bst.update(dtrain, i, obj)
     76             bst.save_rabit_checkpoint()
     77             version += 1

C:\Anaconda3\lib\site-packages\xgboost\core.py in update(self, dtrain, iteration, fobj)
   1366             _check_call(_LIB.XGBoosterUpdateOneIter(self.handle,
   1367                                                     ctypes.c_int(iteration),
-> 1368                                                     dtrain.handle))
   1369         else:
   1370             pred = self.predict(dtrain, output_margin=True, training=True)

C:\Anaconda3\lib\site-packages\xgboost\core.py in _check_call(ret)
    187     """
    188     if ret != 0:
--> 189         raise XGBoostError(py_str(_LIB.XGBGetLastError()))
    190 
    191 

XGBoostError: [15:19:15] C:\1\xgboost\src\gbm\gbtree.cc:410: Check failed: gpu_predictor_: 
```
@trivialfis
Member

Could you please provide a reproducible script?

@rgbinventions
Author

It happens even with your example:

```python
import xgboost as xgb
import numpy as np
from sklearn.datasets import fetch_covtype
from sklearn.model_selection import train_test_split
import time

# Fetch dataset using sklearn
cov = fetch_covtype()
X = cov.data
y = cov.target

# Create 0.75/0.25 train/test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    train_size=0.75,
                                                    random_state=42)

# Specify sufficient boosting iterations to reach a minimum
num_round = 3000

# Leave most parameters as default
param = {'objective': 'multi:softmax',  # Specify multiclass classification
         'num_class': 8,                # Number of possible output classes
         'tree_method': 'gpu_hist'      # Use GPU accelerated algorithm
         }

# Convert input data from numpy to XGBoost format
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)

gpu_res = {}  # Store accuracy result
tmp = time.time()

# Train model
xgb.train(param, dtrain, num_round, evals=[(dtest, 'test')], evals_result=gpu_res)
print("GPU Training Time: %s seconds" % (str(time.time() - tmp)))
```

And here is the error:


```
XGBoostError                              Traceback (most recent call last)
<ipython-input> in <module>
     30 tmp = time.time()
     31 # Train model
---> 32 xgb.train(param, dtrain, num_round, evals=[(dtest, 'test')], evals_result=gpu_res)
     33 print("GPU Training Time: %s seconds" % (str(time.time() - tmp)))

C:\Anaconda3\lib\site-packages\xgboost\training.py in train(params, dtrain, num_boost_round, evals, obj, feval, maximize, early_stopping_rounds, evals_result, verbose_eval, xgb_model, callbacks)
    210                            evals=evals,
    211                            obj=obj, feval=feval,
--> 212                            xgb_model=xgb_model, callbacks=callbacks)
    213 
    214 

C:\Anaconda3\lib\site-packages\xgboost\training.py in _train_internal(params, dtrain, num_boost_round, evals, obj, feval, xgb_model, callbacks)
     73         # Skip the first update if it is a recovery step.
     74         if version % 2 == 0:
---> 75             bst.update(dtrain, i, obj)
     76             bst.save_rabit_checkpoint()
     77             version += 1

C:\Anaconda3\lib\site-packages\xgboost\core.py in update(self, dtrain, iteration, fobj)
   1366             _check_call(_LIB.XGBoosterUpdateOneIter(self.handle,
   1367                                                     ctypes.c_int(iteration),
-> 1368                                                     dtrain.handle))
   1369         else:
   1370             pred = self.predict(dtrain, output_margin=True, training=True)

C:\Anaconda3\lib\site-packages\xgboost\core.py in _check_call(ret)
    187     """
    188     if ret != 0:
--> 189         raise XGBoostError(py_str(_LIB.XGBGetLastError()))
    190 
    191 

XGBoostError: [19:20:38] C:\1\xgboost\src\gbm\gbtree.cc:457: Check failed: gpu_predictor_:
```

@trivialfis
Member

Ah, I can reproduce it when running in a CPU-only environment.

@hcho3
Collaborator

hcho3 commented Apr 24, 2020

@rgbinventions Does your machine have a working GPU?

@rgbinventions
Author

rgbinventions commented Apr 24, 2020

> Ah, I can reproduce it when running in a CPU-only environment.

Sorry, what exactly should I provide?

> @rgbinventions Does your machine have a working GPU?

Yes, and I've built xgboost correctly, following the instructions without problems.

@trivialfis
Member

I suspect there's something wrong with your CUDA installation.

@rgbinventions
Author

> I suspect there's something wrong with your CUDA installation.

OK, I'll try to re-install it. Currently on CUDA Toolkit 10.2, btw.

@trivialfis
Member

trivialfis commented Apr 24, 2020

@rgbinventions Could you run this script if you are using Linux?

```shell
echo '''#include <stdio.h>

int main() {
  int n_devices = 0;
  int code = cudaGetDeviceCount(&n_devices);
  if (code != cudaSuccess) {
    printf("Failed to get device count.\n");
    return -1;
  }
  printf("Number of available devices: %d\n", n_devices);
}
''' > main.cu
nvcc main.cu -o cuda-devices
./cuda-devices
```

@trivialfis
Member

If it prints more than 0 devices, then your installation is successful.
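A similar guard can be run from Python before picking a tree method. The sketch below is only an illustration, not anything from XGBoost's API: the helpers `count_visible_gpus` and `choose_tree_method` are hypothetical names, and it assumes the NVIDIA driver puts `nvidia-smi` on the PATH with its usual `-L` (list devices) output format.

```python
import subprocess

def count_visible_gpus():
    """Count GPUs listed by `nvidia-smi -L`; return 0 if the tool is missing or fails."""
    try:
        result = subprocess.run(["nvidia-smi", "-L"],
                                capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return 0
    # Each device prints one line such as "GPU 0: GeForce GTX 1080 (UUID: ...)".
    return sum(1 for line in result.stdout.splitlines() if line.startswith("GPU "))

def choose_tree_method(n_gpus):
    """Fall back to the CPU 'hist' algorithm when no usable GPU is visible."""
    return "gpu_hist" if n_gpus > 0 else "hist"

params = {"objective": "multi:softmax",
          "num_class": 8,
          "tree_method": choose_tree_method(count_visible_gpus())}
```

This only checks that the driver sees a device; a broken CUDA toolkit install (as in this issue) can still fail later, so the `cuda-devices` program above remains the more direct test.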

@rgbinventions
Author

Sorry, the problem was with the CUDA install, as you said, and with the GPU being compute capability 3.0. You can close this issue.


3 participants