`__all_labels__` method for `prepare_for_training` for `DatasetForTextClassification` expects a set #2482
davidberenstein1957 added the `type: bug` label (Indicates an unexpected problem or unintended behavior) on Mar 6, 2023
davidberenstein1957 added a commit that referenced this issue on Mar 6, 2023 (merged)
frascuchon added a commit that referenced this issue on Mar 9, 2023
# [1.4.0](v1.3.1...v1.4.0) (2023-03-09)

### Features

* `configure_dataset` accepts a workspace as argument ([#2503](#2503)) ([29c9ee3](29c9ee3))
* Add `active_client` function to main argilla module ([#2387](#2387)) ([4e623d4](4e623d4)), closes [#2183](#2183)
* Add text2text support for prepare for training spark nlp ([#2466](#2466)) ([21efb83](21efb83)), closes [#2465](#2465) [#2482](#2482)
* Allow passing workspace as client param for `rg.log` or `rg.load` ([#2425](#2425)) ([b3b897a](b3b897a)), closes [#2059](#2059)
* Bulk annotation improvement ([#2437](#2437)) ([3fce915](3fce915)), closes [#2264](#2264)
* Deprecate `chunk_size` in favor of `batch_size` for `rg.log` ([#2455](#2455)) ([3ebea76](3ebea76)), closes [#2453](#2453)
* Expose `batch_size` parameter for `rg.load` ([#2460](#2460)) ([e25be3e](e25be3e)), closes [#2454](#2454) [#2434](#2434)
* Extend shortcuts to include alphabet for token classification ([#2339](#2339)) ([4a92b35](4a92b35))

### Bug Fixes

* added flexible app redirect to docs page ([#2428](#2428)) ([5600301](5600301)), closes [#2377](#2377)
* added regex match to set workspace method ([#2427](#2427)) ([d789fa1](d789fa1)), closes [#2388]
* error when loading record with empty string query ([#2429](#2429)) ([fc71c3b](fc71c3b)), closes [#2400](#2400) [#2303](#2303)
* Remove extra-action dropdown state after navigation ([#2479](#2479)) ([9328994](9328994)), closes [#2158](#2158)

### Documentation

* Add AutoTrain to readme ([7199780](7199780))
* Add migration to label schema section ([#2435](#2435)) ([d57a1e5](d57a1e5)), closes [#2003](#2003)
* Adds zero+few shot tutorial with SetFit ([#2409](#2409)) ([6c679ad](6c679ad))
* Update readme with quickstart section and new links to guides ([#2333](#2333)) ([91a77ad](91a77ad))

## As always, thanks to our amazing contributors!

- Documentation update: adding missing n (#2362) by @Gnonpi
- feat: Extend shortcuts to include alphabet for token classification (#2339) by @cceyda
cceyda pushed a commit to cceyda/argilla that referenced this issue on Apr 25, 2023
…la-io#2466)

# Description

Added text2text support for spark-nlp training, plus a small bug-fix for prepare for training with spaCy textcat.

Closes argilla-io#2465
Closes argilla-io#2482

**Type of change**

- [X] New feature (non-breaking change which adds functionality)

**How Has This Been Tested**

N.A.

**Checklist**

N.A.

Co-authored-by: dvsrepo <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Describe the bug
The `prepare_for_training` for spaCy textcat is giving me some trouble. I tracked it down to `__all_labels__`, which, given records with `record.annotation = "Consumer"` or `"Technology"`, produces the set `{'m', 'r', 's', 'l', 'g', 'u', 'y', 'n', 't', 'h', 'o', 'c', 'e'}`: `set.update` treats its argument as an iterable, so a bare string is unpacked into its individual characters instead of being added as one label.
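A minimal sketch of the underlying set behaviour in plain Python (the label string mirrors the example above; no argilla code involved):

```python
# set.update() iterates over its argument, so a bare string is split into characters.
all_labels = set()
all_labels.update("Consumer")
print(all_labels)  # e.g. {'C', 'o', 'n', 's', 'u', 'm', 'e', 'r'}

# Wrapping the string in a set (a one-element iterable) adds it as a single label.
all_labels = set()
all_labels.update({"Consumer"})
print(all_labels)  # {'Consumer'}
```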
To Reproduce
N.A.
Expected behavior
`all_labels.update({record.annotation})`
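For illustration, a hypothetical version of the label-collection loop with that fix applied; the function name, the `records` argument, and the multi-label branch are assumptions for this sketch, not the actual argilla implementation:

```python
def collect_all_labels(records):
    """Collect the distinct annotation labels from a list of records (illustrative sketch)."""
    all_labels = set()
    for record in records:
        annotation = record.annotation
        if annotation is None:
            continue
        if isinstance(annotation, str):
            # Wrap the single-label string so update() adds the whole label,
            # not its individual characters.
            all_labels.update({annotation})
        else:
            # Multi-label annotations (a list/tuple of label strings) iterate correctly as-is.
            all_labels.update(annotation)
    return all_labels
```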
Screenshots
N.A.
Environment
N.A.
Additional context
@dvsrepo, thanks for the catch.