To make sure the test columns line up with the training columns, we can run:

    # Match column order from X_train to df_test
    # (to predict on columns, they should be in the same order they were fit on)
    df_test = df_test[X_train.columns]

This line will make sure the columns of df_test match the order of the columns of X_train.
And then:

    # Make predictions on the test dataset using the best model
    test_preds = ideal_model.predict(df_test)
Problem Example
As of Scikit-Learn 1.2+, the columns (features) a model has been fit on must match the columns (features) the model is asked to predict on.
This goes for both the names of the columns and the order of the columns.
For example, if the testing columns are in a different order from the training columns, running model.fit() on the training data and then model.predict() on the testing data will error.
To fix this, you can change the order of the test columns to match the order of the training columns.
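To see the problem in isolation, here is a minimal sketch. The column names, the toy values, and the choice of LinearRegression are all assumptions made up for illustration (they are not taken from the notebook); any Scikit-Learn estimator fit on a DataFrame should run the same feature-name check.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: column names and values are invented for this sketch.
X_train = pd.DataFrame({
    "year_made": [1995, 2000, 2005, 2010],
    "machine_hours": [5000, 4000, 3000, 2000],
})
y_train = [10000, 15000, 20000, 25000]

# Same column names as X_train, but in a different order.
df_test = pd.DataFrame({"machine_hours": [3500], "year_made": [2003]})

model = LinearRegression()
model.fit(X_train, y_train)

try:
    model.predict(df_test)
    mismatch_error = None
except ValueError as err:
    # Expected on Scikit-Learn 1.2+; older versions may only warn instead.
    mismatch_error = err

print(mismatch_error)
```

On Scikit-Learn 1.2+ the printed message should say the feature names need to match those passed during fit. On older versions the call may go through with only a warning, in which case this prints None.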
For example, in end-to-end-bluebook-bulldozer-price-regression.ipynb, our training columns undergo a fair bit of modification.
Code Fix
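To make sure the test columns line up with the training columns, we can run df_test = df_test[X_train.columns].
This line will make sure the columns of df_test match the order of the columns of X_train.
And then we can make predictions on the test dataset using the best model with test_preds = ideal_model.predict(df_test).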
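If you want the reordering step to fail loudly when the test data is missing a training column, one option is to wrap it in a small helper. This is only a sketch: align_test_columns is a hypothetical name, and the toy DataFrames below are invented for illustration rather than taken from the notebook.

```python
import pandas as pd


def align_test_columns(df_test: pd.DataFrame, X_train: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return df_test with its columns in X_train's order."""
    missing = [col for col in X_train.columns if col not in df_test.columns]
    if missing:
        raise ValueError(f"Test data is missing training columns: {missing}")
    # Selecting with X_train.columns reorders the columns and drops any extras.
    return df_test[X_train.columns]


# Invented toy data: same features in a different order, plus one extra column.
X_train = pd.DataFrame({"year_made": [1995, 2000], "machine_hours": [5000, 4000]})
df_test = pd.DataFrame({"machine_hours": [3500], "year_made": [2003], "sales_id": [1]})

df_test_aligned = align_test_columns(df_test, X_train)
print(list(df_test_aligned.columns))  # ['year_made', 'machine_hours']
```

Plain df_test[X_train.columns] already raises a KeyError when a column is missing; the helper just makes the message more explicit. Note that selecting by X_train.columns also silently drops columns that only exist in the test data, such as sales_id here.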
--
A big thank you to @arpadikuma for the pull request to update this, see #61.
This code has been added to end-to-end-bluebook-bulldozer-price-regression.ipynb in the section "Preprocessing the test data".