Feat/2465 add text2text support for prepare for training spark nlp #2466

davidberenstein1957 · 2023-03-02T15:25:51Z

Description

Added Text2Text support for `prepare_for_training` with spark-nlp.
Small bug fix for `prepare_for_training` with spaCy textcat.

Closes #2465
Closes #2482
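
As a rough illustration of what Text2Text support for spark-nlp preparation involves, the sketch below flattens annotated records into tabular rows that could then be loaded into a DataFrame. The `Text2TextRecord` class, the `to_spark_nlp_rows` helper, and the `id`/`text`/`target` column names are assumptions made for illustration; they are not taken from this PR's diff.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Text2TextRecord:
    """Hypothetical stand-in for an annotated Text2Text record (not the argilla class)."""

    id: int
    text: str
    annotation: Optional[str] = None


def to_spark_nlp_rows(records: List[Text2TextRecord]) -> List[Dict[str, object]]:
    """Keep only annotated records and flatten them into id/text/target rows."""
    return [
        {"id": r.id, "text": r.text, "target": r.annotation}
        for r in records
        if r.annotation is not None
    ]


records = [
    Text2TextRecord(id=0, text="Hello world", annotation="Hallo Welt"),
    Text2TextRecord(id=1, text="Not annotated yet"),
]

# Unannotated records are skipped, so only the first record survives.
rows = to_spark_nlp_rows(records)
```

Filtering on `annotation is not None` reflects the usual expectation that only annotated records are useful as training data; whether the actual implementation filters the same way is not stated in this description.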

Type of change


New feature (non-breaking change which adds functionality)

How Has This Been Tested

N.A.

Checklist
N.A.


codecov · 2023-03-02T15:40:40Z

Codecov Report

Patch coverage: 95.23% and project coverage change: +0.50 🎉

Comparison is base (40ca933) 92.12% compared to head (b10bba4) 92.63%.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #2466      +/-   ##
===========================================
+ Coverage    92.12%   92.63%   +0.50%     
===========================================
  Files          161      161              
  Lines         7885     7915      +30     
===========================================
+ Hits          7264     7332      +68     
+ Misses         621      583      -38

| Flag | Coverage Δ | |
|---|---|---|
| pytest | `92.63% <95.23%> (+0.50%)` | ⬆️ |


| Impacted Files | Coverage Δ | |
|---|---|---|
| src/argilla/client/datasets.py | `86.07% <95.23%> (+9.77%)` | ⬆️ |
| ...gilla/labeling/text_classification/label_errors.py | `86.41% <0.00%> (-3.71%)` | ⬇️ |
| src/argilla/client/apis/datasets.py | `97.19% <0.00%> (-0.06%)` | ⬇️ |
| src/argilla/client/client.py | `87.20% <0.00%> (-0.04%)` | ⬇️ |
| src/argilla/client/api.py | `97.67% <0.00%> (ø)` | |





# [1.4.0](v1.3.1...v1.4.0) (2023-03-09)

### Features

* `configure_dataset` accepts a workspace as argument ([#2503](#2503)) ([29c9ee3](29c9ee3)),
* Add `active_client` function to main argilla module ([#2387](#2387)) ([4e623d4](4e623d4)), closes [#2183](#2183)
* Add text2text support for prepare for training spark nlp ([#2466](#2466)) ([21efb83](21efb83)), closes [#2465](#2465) [#2482](#2482)
* Allow passing workspace as client param for `rg.log` or `rg.load` ([#2425](#2425)) ([b3b897a](b3b897a)), closes [#2059](#2059)
* Bulk annotation improvement ([#2437](#2437)) ([3fce915](3fce915)), closes [#2264](#2264)
* Deprecate `chunk_size` in favor of `batch_size` for `rg.log` ([#2455](#2455)) ([3ebea76](3ebea76)), closes [#2453](#2453)
* Expose `batch_size` parameter for `rg.load` ([#2460](#2460)) ([e25be3e](e25be3e)), closes [#2454](#2454) [#2434](#2434)
* Extend shortcuts to include alphabet for token classification ([#2339](#2339)) ([4a92b35](4a92b35))

### Bug Fixes

* added flexible app redirect to docs page ([#2428](#2428)) ([5600301](5600301)), closes [#2377](#2377)
* added regex match to set workspace method ([#2427](#2427)) ([d789fa1](d789fa1)), closes [#2388]
* error when loading record with empty string query ([#2429](#2429)) ([fc71c3b](fc71c3b)), closes [#2400](#2400) [#2303](#2303)
* Remove extra-action dropdown state after navigation ([#2479](#2479)) ([9328994](9328994)), closes [#2158](#2158)

### Documentation

* Add AutoTrain to readme ([7199780](7199780))
* Add migration to label schema section ([#2435](#2435)) ([d57a1e5](d57a1e5)), closes [#2003](#2003) [#2003](#2003)
* Adds zero+few shot tutorial with SetFit ([#2409](#2409)) ([6c679ad](6c679ad))
* Update readme with quickstart section and new links to guides ([#2333](#2333)) ([91a77ad](91a77ad))

## As always, thanks to our amazing contributors!

- Documentation update: adding missing n (#2362) by @Gnonpi
- feat: Extend shortcuts to include alphabet for token classification (#2339) by @cceyda

…la-io#2466)

# Description

Added text2text support for spark-nlp training
small bug-fix for prepare for training with spacy textcat

Closes argilla-io#2465
Closes argilla-io#2482

**Type of change**

(Please delete options that are not relevant. Remember to title the PR according to the type of change)

- [X] New feature (non-breaking change which adds functionality)

**How Has This Been Tested**

N.A.

**Checklist**

N.A.

---------

Co-authored-by: dvsrepo <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

davidberenstein1957 added 2 commits March 2, 2023 16:23

fix: added Text2Text support for prepare_for_training spark-nlp

01cf83b

fix: removed prepare_for_training NotImplemented for Text2Text dataset

04f5ec4

davidberenstein1957 linked an issue Mar 2, 2023 that may be closed by this pull request

add Text2Text support for prepare_for_training spark-nlp #2465

Closed

chore: merge branch 'main' into feat/2465-add-text2text-support-for-prepare_for_training-spark-nlp

8686756

davidberenstein1957 and others added 11 commits March 2, 2023 17:51

fix: updated prepare_for_training_test

1787cba

Merge branch 'develop' into feat/2465-add-text2text-support-for-prepare_for_training-spark-nlp (conflicts: tests/client/test_dataset.py)

5b27d94

chore: updated test coverage

fb869a7

fix: bugfix for #2482

4028c05

Merge branch 'develop' into feat/2465-add-text2text-support-for-prepare_for_training-spark-nlp

e098237

Type checking and fixes for single and multilabel datasets

1ec8530

[pre-commit.ci] auto fixes from pre-commit.com hooks

bdf95a9

for more information, see https://pre-commit.ci

fix problem with all_labels

bb25b39

fix: added additional tests

2390cec

fix: remove setting HF_HUB_ACCESS_TOKEN to None

b10bba4

davidberenstein1957 requested a review from frascuchon March 6, 2023 15:59

frascuchon approved these changes Mar 6, 2023


frascuchon merged commit 21efb83 into develop Mar 6, 2023

frascuchon deleted the feat/2465-add-text2text-support-for-prepare_for_training-spark-nlp branch March 6, 2023 21:53

frascuchon mentioned this pull request Mar 8, 2023

Release 1.4.0 #2500

Merged
