-
Notifications
You must be signed in to change notification settings - Fork 377
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deprecate chunk_size
in favor of batch_size
for rg.log
#2455
Deprecate chunk_size
in favor of batch_size
for rg.log
#2455
Conversation
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## develop #2455 +/- ##
===========================================
- Coverage 92.00% 91.63% -0.38%
===========================================
Files 162 162
Lines 7896 7899 +3
===========================================
- Hits 7265 7238 -27
- Misses 631 661 +30
Flags with carried forward coverage won't be shown. Click here to find out more.
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FutureWarning seems fine to me. I found some other chunk-size
references within the scan_data_response
methods for src/argilla/server/apis/v0/handlers
. IMO these can also be renamed, but @frascuchon not sure if you agree.
@frascuchon these methods
Yes, but I think this can be handled within the PR for the other issue. However, because the batch_size was set by default, I wanted to mention before we forget.
… On 2 Mar 2023, at 10:56, Tom Aarsen ***@***.***> wrote:
@tomaarsen commented on this pull request.
In src/argilla/client/client.py <#2455 (comment)>:
> @@ -115,7 +115,7 @@ class Argilla:
"""
# Larger sizes will trigger a warning
- _MAX_CHUNK_SIZE = 5000
+ _MAX_BATCH_SIZE = 5000
Sounds good, shall I introduce a warning like this for rg.load?
https://github.com/argilla-io/argilla/blob/f5834a5408051bf1fa60d42aede6b325dc07fdbd/src/argilla/client/client.py#L338-L343
—
Reply to this email directly, view it on GitHub <#2455 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AGAZHZGJBTNKQQKDOCZMP4DW2BVFHANCNFSM6AAAAAAVNDKSSA>.
You are receiving this because your review was requested.
|
I'll include it in the other PR that I'm preparing :) |
No worries about that. These endpoints are not used anymore and will be removed in v1.5.0 |
# [1.4.0](v1.3.1...v1.4.0) (2023-03-09) ### Features * `configure_dataset` accepts a workspace as argument ([#2503](#2503)) ([29c9ee3](29c9ee3)), * Add `active_client` function to main argilla module ([#2387](#2387)) ([4e623d4](4e623d4)), closes [#2183](#2183) * Add text2text support for prepare for training spark nlp ([#2466](#2466)) ([21efb83](21efb83)), closes [#2465](#2465) [#2482](#2482) * Allow passing workspace as client param for `rg.log` or `rg.load` ([#2425](#2425)) ([b3b897a](b3b897a)), closes [#2059](#2059) * Bulk annotation improvement ([#2437](#2437)) ([3fce915](3fce915)), closes [#2264](#2264) * Deprecate `chunk_size` in favor of `batch_size` for `rg.log` ([#2455](#2455)) ([3ebea76](3ebea76)), closes [#2453](#2453) * Expose `batch_size` parameter for `rg.load` ([#2460](#2460)) ([e25be3e](e25be3e)), closes [#2454](#2454) [#2434](#2434) * Extend shortcuts to include alphabet for token classification ([#2339](#2339)) ([4a92b35](4a92b35)) ### Bug Fixes * added flexible app redirect to docs page ([#2428](#2428)) ([5600301](5600301)), closes [#2377](#2377) * added regex match to set workspace method ([#2427](#2427)) ([d789fa1](d789fa1)), closes [#2388] * error when loading record with empty string query ([#2429](#2429)) ([fc71c3b](fc71c3b)), closes [#2400](#2400) [#2303](#2303) * Remove extra-action dropdown state after navigation ([#2479](#2479)) ([9328994](9328994)), closes [#2158](#2158) ### Documentation * Add AutoTrain to readme ([7199780](7199780)) * Add migration to label schema section ([#2435](#2435)) ([d57a1e5](d57a1e5)), closes [#2003](#2003) [#2003](#2003) * Adds zero+few shot tutorial with SetFit ([#2409](#2409)) ([6c679ad](6c679ad)) * Update readme with quickstart section and new links to guides ([#2333](#2333)) ([91a77ad](91a77ad)) ## As always, thanks to our amazing contributors! - Documentation update: adding missing n (#2362) by @Gnonpi - feat: Extend shortcuts to include alphabet for token classification (#2339) by @cceyda
Closes #2453
Pull Request overview
chunk_size
in favor ofbatch_size
inrg.log
:chunk_size
to the end of all related signatures.chunk_size
default to None and update the typing accordingly.batch_size
in the old position in the signature.FutureWarning
ifchunk_size
is used.rg.log(..., chunk_size=100)
indeed throws a warning.Details
Note that this deprecation is non-breaking: Code that uses
chunk_size
will still work, asbatch_size = chunk_size
after a FutureWarning is given, ifchunk_size
is notNone
.Discussion
Type of change
How Has This Been Tested
I introduced a test, and ran all tests.
Checklist