ValueError: Unable to create node 'LabelEncoder' with name='LabelEncoder3' and attributes={'keys_floats': array([False, True]), 'values_int64s': array([0, 1])}. #1047
Comments
Did you try |
@xadupre, yes, I tried. I've just tried one more time and got exactly the same error (I compared it to the output above using www.diffchecker.com and the only difference is on the screenshot). |
I created PR #1049 to look into your issue but I can't replicate it. I assume the test I used is different from yours. Could you let me know what is different? |
@xadupre, thank you for the tests! I looked at the tests and they gave me an idea of what is wrong. I haven't noticed that
but the error persists:
What am I missing? It seems there is a problem when converting the "boolean" column |
Booleans are not supported by the ONNX LabelEncoder (see https://onnx.ai/onnx/operators/onnx_aionnxml_LabelEncoder.html#l-onnx-docai-onnx-ml-labelencoder). You should convert them into int64 before calling the converter. |
Oh, I didn't know this...
Sorry, I don't understand. Should I convert the entire column before training? Or can I just convert the provided And why does the |
You can use boolean for training but the input schema for the converter must be integers and onnxruntime will expect integers as well when running inference. |
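A minimal sketch of the conversion suggested above, assuming the boolean features are stored as plain bool-dtype pandas columns. The helper name, column names, and values are made up for illustration; categorical columns whose categories are booleans would need their own handling.

```python
import pandas as pd


def cast_bool_columns_to_int64(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with every bool-dtype column cast to int64."""
    bool_cols = df.select_dtypes(include="bool").columns
    return df.astype({col: "int64" for col in bool_cols})


# Hypothetical toy frame standing in for the real training data.
X = pd.DataFrame(
    {
        "holiday": [False, True],
        "season": ["spring", "fall"],
        "temp": [9.8, 21.3],
    }
)

X_int = cast_bool_columns_to_int64(X)
print(X_int.dtypes)
```

The same cast would then be applied to whatever sample is passed to the converter and to the data fed to onnxruntime at inference time, so that the declared input types and the actual inputs agree.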
Ah, I see. It's more coherent to convert the entire dataset before the training then. I did this and
has worked correctly:
Thank you for your help! Closing the issue. |
Oh, no, there is another error when I try to load the onnx model.
gives
What can be wrong with converting the LabelEncoder?.. The initial types are
|
Based on the error message, I assume one input is expected to be an integer by the LabelEncoder but it is not. |
Sorry, I don't know onnx well, but at what point is something sent to the LabelEncoder? I do not provide any data. If
|
Could you use a tool like netron to look at your model and search for node LabelEncoder5? It should show you the data it processes and lead you to the input it is connected to. |
The concat node is casting every column to a single type; I assume it is string, so every label encoder is expecting string input. It seems skl2onnx is unable to handle this scenario. It usually follows the scikit-learn implementation, but sometimes we do not test it against all cases, and sometimes scikit-learn changes its implementation. You may try the latest version released today to see if it fixes it. Otherwise, I suggest splitting the ordinal encoder into two, one for string columns and another for integer columns. It should not impact your pipeline, but the converted model should have two concat nodes, one for strings and another for integers. |
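The split suggested above could look roughly like the following on the scikit-learn side. This is only a sketch: the column names, the toy data, and the transformer labels are assumptions, and the actual pipeline from the issue is not reproduced here.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical toy data: one string column and one integer column.
X = pd.DataFrame(
    {
        "weather": ["clear", "rain", "misty"],
        "holiday": [0, 1, 0],
    }
)

string_cols = ["weather"]
int_cols = ["holiday"]

# Two separate encoders, so each one only ever sees columns of a single type.
preprocess = ColumnTransformer(
    transformers=[
        ("ordinal_str", OrdinalEncoder(), string_cols),
        ("ordinal_int", OrdinalEncoder(), int_cols),
    ],
    remainder="passthrough",
)

encoded = preprocess.fit_transform(X)
print(encoded)
```

Because each encoder receives homogeneous columns, a converter has no reason to cast strings and integers to one common type before encoding.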
Oh, it's strange that this is done without any warning... Thank you for pointing it out!
I've tried 1.16.0 and the RE is gone! But when I tried to mimic your tests, the predictions of the original and the onnx versions of the model differed too much. Why can this be? The code is:
output:
|
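When comparing the two sets of predictions, it can help to report where the largest gap occurs instead of only a pass/fail result. The sketch below uses made-up arrays in place of the scikit-learn and onnxruntime outputs; `prediction_gap` is a hypothetical helper, and the float32 dtype and (n, 1) shape of the second array are assumptions about what the runtime returns.

```python
import numpy as np


def prediction_gap(expected, actual):
    """Return the largest absolute difference and the row where it occurs."""
    expected = np.asarray(expected, dtype=np.float64).ravel()
    actual = np.asarray(actual, dtype=np.float64).ravel()
    abs_diff = np.abs(expected - actual)
    return {"max_abs": float(abs_diff.max()), "argmax": int(abs_diff.argmax())}


# Made-up values standing in for model.predict(X) and the onnxruntime output.
sk_pred = np.array([10.0, 20.0, 30.0])
onnx_pred = np.array([[10.0], [20.5], [30.0]], dtype=np.float32)

gap = prediction_gap(sk_pred, onnx_pred)
print(gap)
```

Inspecting the input row at the reported index often narrows the search to a specific feature or category that the two versions treat differently.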
I'll need to get the full example to understand what is going on (without the data). What is RE? |
Oops, sorry, I used the wrong abbreviation. I meant the error:
The repository is public and contains little code, since it is designed for educational purposes, but I'm not sure you have time to figure out even this code. The above code is just a modification of the 68th line. I'm running the
|
If you could create a failed unittest like in this file https://github.com/onnx/sklearn-onnx/blob/main/tests/test_issues_2024.py, it would save me some time. |
I can try! |
Thanks :) |
I decided to create an example on Google's Colab as a first attempt, since you may have some improvement remarks. Moreover, if I don't save and then load the dataset, I obtain a strange error:
produces
whereas with
everything works fine... If you know how to fix this, I can create a PR with adjusted code added as a unit test. |
@xadupre, have you looked at the notebook? Have you observed the same behaviour? |
@xadupre, have you managed to take a look at the notebook? |
Hi there!
I'm trying to convert the sklearn's Pipeline to the ONNX format, but I get a strange error. The Pipeline is the following:
The dataset is obtained via fetch_openml("Bike_Sharing_Demand", version=2, as_frame=True, parser="pandas") and a little preprocessing (e.g. the "heavy_rain" category is merged with the "rain" category). The result of print(X_train.iloc[:1]) is
When I run model_onnx = to_onnx(model, X=X_train.iloc[:1], verbose=1), I get the error in the title. The type guessing seems to work fine:
The remaining part of the output is:
The skl2onnx version from the poetry.lock is:
UPD: I've also decided to leave here the python and scikit-learn versions: