Feat/2465 add text2text support for prepare for training spark nlp #2466

@davidberenstein1957 davidberenstein1957 commented Mar 2, 2023

Description

Adds text2text support to `prepare_for_training` for the Spark NLP framework.
Also includes a small bug fix for `prepare_for_training` with spaCy textcat.

Closes #2465
Closes #2482
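
The description does not show what the prepared data looks like. As a rough, library-independent sketch, preparing text2text records for training could amount to dropping unannotated records and flattening the rest into rows. The record class, the function name, and the column names below are hypothetical illustrations, not taken from Argilla or Spark NLP:

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass
class Text2TextRecord:
    """Hypothetical stand-in for a text2text record (not Argilla's class)."""

    text: str
    annotation: Optional[str] = None  # None means the record is not annotated yet


def to_training_rows(
    records: List[Text2TextRecord],
) -> List[Dict[str, Union[int, str]]]:
    """Keep only annotated records and flatten them into plain rows.

    The column names ("id", "text", "target") are assumptions made for
    this sketch; the real output format may differ.
    """
    rows: List[Dict[str, Union[int, str]]] = []
    for idx, record in enumerate(records):
        if record.annotation is None:
            # Unannotated records carry no target, so they are skipped.
            continue
        rows.append({"id": idx, "text": record.text, "target": record.annotation})
    return rows
```

Rows in this shape can be loaded into a tabular structure such as a data frame before being handed to a training pipeline.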

Type of change

(Please delete options that are not relevant. Remember to title the PR according to the type of change)

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested

N.A.

Checklist
N.A.

@davidberenstein1957 davidberenstein1957 linked an issue Mar 2, 2023 that may be closed by this pull request

codecov bot commented Mar 2, 2023

Codecov Report

Patch coverage: 95.23% and project coverage change: +0.50 🎉

Comparison is base (40ca933) 92.12% compared to head (b10bba4) 92.63%.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #2466      +/-   ##
===========================================
+ Coverage    92.12%   92.63%   +0.50%     
===========================================
  Files          161      161              
  Lines         7885     7915      +30     
===========================================
+ Hits          7264     7332      +68     
+ Misses         621      583      -38     
| Flag | Coverage Δ |
|---|---|
| pytest | 92.63% <95.23%> (+0.50%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
|---|---|
| src/argilla/client/datasets.py | 86.07% <95.23%> (+9.77%) ⬆️ |
| ...gilla/labeling/text_classification/label_errors.py | 86.41% <0.00%> (-3.71%) ⬇️ |
| src/argilla/client/apis/datasets.py | 97.19% <0.00%> (-0.06%) ⬇️ |
| src/argilla/client/client.py | 87.20% <0.00%> (-0.04%) ⬇️ |
| src/argilla/client/api.py | 97.67% <0.00%> (ø) |


davidberenstein1957 and others added 11 commits March 2, 2023 17:51
…re_for_training-spark-nlp

# Conflicts:
#	tests/client/test_dataset.py
…g-spark-nlp' of github.com:argilla-io/argilla into feat/2465-add-text2text-support-for-prepare_for_training-spark-nlp

* 'feat/2465-add-text2text-support-for-prepare_for_training-spark-nlp' of github.com:argilla-io/argilla:
  [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
@frascuchon frascuchon merged commit 21efb83 into develop Mar 6, 2023
@frascuchon frascuchon deleted the feat/2465-add-text2text-support-for-prepare_for_training-spark-nlp branch March 6, 2023 21:53
@frascuchon frascuchon mentioned this pull request Mar 8, 2023
frascuchon added a commit that referenced this pull request Mar 9, 2023
# [1.4.0](v1.3.1...v1.4.0)
(2023-03-09)

### Features

* `configure_dataset` accepts a workspace as argument
([#2503](#2503))
([29c9ee3](29c9ee3)),
* Add `active_client` function to main argilla module
([#2387](#2387))
([4e623d4](4e623d4)),
closes [#2183](#2183)
* Add text2text support for prepare for training spark nlp
([#2466](#2466))
([21efb83](21efb83)),
closes [#2465](#2465)
[#2482](#2482)
* Allow passing workspace as client param for `rg.log` or `rg.load`
([#2425](#2425))
([b3b897a](b3b897a)),
closes [#2059](#2059)
* Bulk annotation improvement
([#2437](#2437))
([3fce915](3fce915)),
closes [#2264](#2264)
* Deprecate `chunk_size` in favor of `batch_size` for `rg.log`
([#2455](#2455))
([3ebea76](3ebea76)),
closes [#2453](#2453)
* Expose `batch_size` parameter for `rg.load`
([#2460](#2460))
([e25be3e](e25be3e)),
closes [#2454](#2454)
[#2434](#2434)
* Extend shortcuts to include alphabet for token classification
([#2339](#2339))
([4a92b35](4a92b35))


### Bug Fixes

* added flexible app redirect to docs page
([#2428](#2428))
([5600301](5600301)),
closes [#2377](#2377)
* added regex match to set workspace method
([#2427](#2427))
([d789fa1](d789fa1)),
closes [#2388](#2388)
* error when loading record with empty string query
([#2429](#2429))
([fc71c3b](fc71c3b)),
closes [#2400](#2400)
[#2303](#2303)
* Remove extra-action dropdown state after navigation
([#2479](#2479))
([9328994](9328994)),
closes [#2158](#2158)


### Documentation

* Add AutoTrain to readme
([7199780](7199780))
* Add migration to label schema section
([#2435](#2435))
([d57a1e5](d57a1e5)),
closes [#2003](#2003)
* Adds zero+few shot tutorial with SetFit
([#2409](#2409))
([6c679ad](6c679ad))
* Update readme with quickstart section and new links to guides
([#2333](#2333))
([91a77ad](91a77ad))


## As always, thanks to our amazing contributors!
- Documentation update: adding missing n (#2362) by @Gnonpi
- feat: Extend shortcuts to include alphabet for token classification
(#2339) by @cceyda
cceyda pushed a commit to cceyda/argilla that referenced this pull request Apr 25, 2023
…la-io#2466)

# Description

Adds text2text support to `prepare_for_training` for the Spark NLP framework.
Also includes a small bug fix for `prepare_for_training` with spaCy textcat.

Closes argilla-io#2465
Closes argilla-io#2482 

**Type of change**

(Please delete options that are not relevant. Remember to title the PR
according to the type of change)


- [X] New feature (non-breaking change which adds functionality)

**How Has This Been Tested**

N.A.

**Checklist**
N.A.

---------

Co-authored-by: dvsrepo <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>