The .predict() method of xgboost.dask.DaskXGBClassifier currently returns probabilities. Per the specification, the .predict() method is supposed to return class labels. This is also inconsistent with the behavior of the .predict() method of xgboost.XGBClassifier, which properly returns class labels.
Other functionality in dask (specifically in dask_ml.model_selection) depends on the behavior being correct.
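To illustrate why downstream code cares about this, here is a minimal sketch that uses scikit-learn's accuracy_score as a stand-in for a scorer that expects class labels. The arrays are made-up values, not output from DaskXGBClassifier, and the 0.5 threshold is an assumption that only makes sense for a binary objective such as binary:logistic.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical values for illustration only (not real model output).
y_true = np.array([0, 0, 1, 1, 1])
proba = np.array([0.12, 0.31, 0.87, 0.66, 0.93])  # assumed P(class == 1)

# A label-based metric rejects probabilities passed in place of labels.
try:
    accuracy_score(y_true, proba)
    scoring_failed = False
except ValueError as err:
    scoring_failed = True
    print("scoring failed:", err)

# Thresholding the probabilities yields class labels, which is what
# .predict() is expected to return for a binary classifier.
labels = (proba > 0.5).astype(int)
print(labels)
# [0 0 1 1 1]
print(accuracy_score(y_true, labels))
# 1.0
```

For a multi-class problem the equivalent step would be an argmax over the class axis rather than a threshold.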
Example of correct behavior in xgboost.XGBClassifier:
import xgboost as xgb
from sklearn.datasets import make_classification
X, y = make_classification(n_samples=1000, n_informative=5, n_classes=2, random_state=1234)
clf = xgb.XGBClassifier(objective="binary:logistic")
clf.fit(X, y)
print(clf.predict(X)[:5])
# [0 0 1 1 1]
Example of incorrect behavior in xgboost.dask.DaskXGBClassifier:
Sklearn Specification for Classifiers