[BUG] `TextClassification` dataset displays the labels in wrong order #3828
gabrielmbmb added the `type: bug` and `area: api` labels on Sep 26, 2023
@frascuchon do we want to fix this or just allow for this in the

I think @gabrielmbmb fixed this for a community user. Please confirm.

@gabrielmbmb, can you please confirm that this is already solved? If that's the case, please close the issue.

@gabrielmbmb, this might be an issue in the TokenClassification dataset in terms of how the label order is visualized in the UI.
gabrielmbmb added a commit that referenced this issue on Nov 28, 2023
# Description

This PR fixes a bug where the order of the labels for a Text Classification dataset provided in the class `TextClassificationSettings` was not preserved. This was happening because the `label_schema` attribute had the `Set[str]` type to ensure there are no duplicate labels, but `set` doesn't preserve the order. Instead of using `set` to ensure there are no duplicates, `label_schema` now has the `List[str]` type and a basic for loop has been added to ensure there are no duplicates.

Closes #3828

**Type of change**

- [x] Bug fix (non-breaking change which fixes an issue)

**How Has This Been Tested**

```python
import argilla as rg

rg.set_workspace("argilla")

# The last label is repeated on purpose to check that duplicates are dropped
# while the order of the remaining labels is kept.
settings = rg.TextClassificationSettings(
    label_schema=[
        "1 (extremely positive/supportive)",
        "2 (positive/supportive)",
        "3 (neutral)",
        "4 (hateful/unsupportive)",
        "5 (extremely hateful/unsupportive)",
        "6 (can't say!)",
        "6 (can't say!)",
        "6 (can't say!)",
    ]
)

# Create the dataset by logging one record, then apply the settings to it.
rg.log(rg.TextClassificationRecord(text="blablabla"), name="test-order-labels")
rg.configure_dataset_settings(name="test-order-labels", settings=settings)
```

After that, go to the UI and check that the labels appear in the provided order.

**Checklist**

- [x] I followed the style guidelines of this project
- [x] I did a self-review of my code
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [ ] I filled out [the contributor form](https://tally.so/r/n9XrxK) (see text above)
- [x] I have added relevant notes to the `CHANGELOG.md` file (See https://keepachangelog.com/)

---------

Co-authored-by: Francisco Aranda
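
For readers following along, here is a minimal sketch of the idea the PR description refers to: removing duplicates with a plain loop so that the first-seen order is kept. This is not the actual Argilla code; the helper name `dedupe_preserving_order` is hypothetical and only used for illustration.

```python
from typing import Iterable, List


def dedupe_preserving_order(labels: Iterable[str]) -> List[str]:
    """Return the labels without duplicates, keeping first-seen order.

    Hypothetical helper for illustration; not part of the Argilla API.
    """
    seen = set()  # used only for membership checks, never for ordering
    unique: List[str] = []
    for label in labels:
        if label not in seen:
            seen.add(label)
            unique.append(label)
    return unique


labels = [
    "1 (extremely positive/supportive)",
    "2 (positive/supportive)",
    "3 (neutral)",
    "4 (hateful/unsupportive)",
    "5 (extremely hateful/unsupportive)",
    "6 (can't say!)",
    "6 (can't say!)",
    "6 (can't say!)",
]

deduped = dedupe_preserving_order(labels)
for label in deduped:
    print(label)
```

Since Python 3.7 `dict` keeps insertion order, so `list(dict.fromkeys(labels))` would give the same result in one line. The explicit loop is shown because it matches the approach described in the PR.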
davidberenstein1957 pushed a commit that referenced this issue on Nov 29, 2023
Describe the bug
When setting the `label_schema` for a `TextClassification` dataset, the order in which the labels were provided is lost.
Stacktrace and Code to create the bug
Expected behavior
The order in which the labels were provided is not lost.
Environment: