fix various bugs in RareLabelEncoder #665

Merged
solegalli merged 2 commits into main from rare_labels_bug_na_ignore
Apr 30, 2023

Conversation

solegalli
Collaborator

closes #651

@solegalli
Collaborator Author

Hey @ClaudioSalvatoreArcidiacono

This PR should resolve the issue of rare labels not working with column transformer.

I tested it and it worked fine for me.

Would you like to double check before I merge?

@solegalli
Collaborator Author

The mess was introduced in the last version, when we allowed the encoder to work with NaN. Lots of tiny, nasty bugs here and there :_(
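To make the intended behavior concrete, here is a minimal pure-Python sketch of rare-label grouping that skips missing values, which is roughly what `missing_values="ignore"` is meant to do. This is not the feature_engine implementation: `is_missing` and `group_rare_labels` are hypothetical helpers, and the `>= tol` threshold rule is an assumption made for illustration.

```python
import math
from collections import Counter


def is_missing(value):
    """Return True for None or a float NaN (hypothetical helper)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def group_rare_labels(values, tol=0.05, replace_with="Rare"):
    """Replace infrequent labels with `replace_with`, leaving missing values as-is.

    Frequencies are computed over the non-missing values only (assumption).
    """
    observed = [v for v in values if not is_missing(v)]
    if not observed:
        # Nothing to learn from: return the input unchanged.
        return list(values)

    counts = Counter(observed)
    total = len(observed)
    # A label is "frequent" when its share of observed values reaches tol.
    frequent = {label for label, n in counts.items() if n / total >= tol}

    return [
        v if is_missing(v) or v in frequent else replace_with
        for v in values
    ]


column = ["a"] * 5 + ["b"] * 4 + ["c", None]
encoded = group_rare_labels(column, tol=0.2)
```

In this sketch `"c"` makes up 1 of 10 observed values, so it falls below `tol=0.2` and is grouped, while the `None` entry passes through untouched instead of being relabeled or raising an error.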

@ClaudioSalvatoreArcidiacono
Contributor

Hey @solegalli! Thanks a lot for picking up this issue!

May I suggest adding a unit test that checks that the code I posted on the issue page (or something equivalent) works as expected?

@solegalli
Collaborator Author

solegalli commented Apr 30, 2023

Hey @ClaudioSalvatoreArcidiacono

Thank you for the suggestion. I thought about it, but our transformers are not designed to work with the ColumnTransformer. They happen to work with it because we try to be as compatible as possible with the sklearn API.

So I'd rather not include that test. Cheers

@solegalli solegalli merged commit b274c95 into main Apr 30, 2023
@solegalli solegalli deleted the rare_labels_bug_na_ignore branch April 30, 2023 09:21
Development

Successfully merging this pull request may close these issues.

RareLabelEncoder with missing_values="ignore" does not work properly with sklearn.compose.ColumnTransformer
2 participants