Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Release 1.4.0 #2500

Merged
merged 53 commits into from
Mar 9, 2023
Merged

Release 1.4.0 #2500

merged 53 commits into from
Mar 9, 2023

Conversation

frascuchon
Copy link
Member

@frascuchon frascuchon commented Mar 8, 2023

1.4.0 (2023-03-09)

Features

Bug Fixes

Documentation

As always, thanks to our amazing contributors!

pre-commit-ci bot and others added 30 commits February 7, 2023 17:27
<!--pre-commit.ci start-->
updates:
- [github.com/psf/black: 22.12.0 →
23.1.0](psf/black@22.12.0...23.1.0)
- [github.com/pycqa/isort: 5.11.5 →
5.12.0](PyCQA/isort@5.11.5...5.12.0)
<!--pre-commit.ci end-->

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
# Description

Please include a summary of the changes and the related issue. Please
also include relevant motivation and context. List any dependencies that
are required for this change.

Closes #<issue_number>

**Type of change**

(Please delete options that are not relevant. Remember to title the PR
according to the type of change)

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [ ] Refactor (change restructuring the codebase without changing
functionality)
- [ ] Improvement (change adding some improvement to an existing
functionality)
- [ ] Documentation update

**How Has This Been Tested**

(Please describe the tests that you ran to verify your changes. And
ideally, reference `tests`)

- [ ] Test A
- [ ] Test B

**Checklist**

- [ ] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [ ] follows the style guidelines of this project
- [ ] I did a self-review of my code
- [ ] I added comments to my code
- [ ] I made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
# Description

Please include a summary of the changes and the related issue. Please
also include relevant motivation and context. List any dependencies that
are required for this change.

Closes #<issue_number>

**Type of change**

(Please delete options that are not relevant. Remember to title the PR
according to the type of change)

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [ ] Refactor (change restructuring the codebase without changing
functionality)
- [ ] Improvement (change adding some improvement to an existing
functionality)
- [ ] Documentation update

**How Has This Been Tested**

(Please describe the tests that you ran to verify your changes. And
ideally, reference `tests`)

- [ ] Test A
- [ ] Test B

**Checklist**

- [ ] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [ ] follows the style guidelines of this project
- [ ] I did a self-review of my code
- [ ] I added comments to my code
- [ ] I made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
Hello!

# Pull Request overview
* Replaced `isort` by `ruff` in `pre-commit`
  * This has fixed the import sorting throughout the repo.
* Manually went through `ruff` errors and fixed a bunch:
  * Unused imports
  * Added `if TYPE_CHECKING:` statements
* Placed `# noqa: <errcode>` where the warned-about statement is the
desired behaviour
* Add basic `ruff` configuration to `pyproject.toml`

## Details
This PR focuses on replacing `isort` by `ruff` in `pre-commit`. The
motivation for this is that:
* `isort` frequently breaks. I have experienced 2 separate occasions in
the last few months alone where the latest `isort` release has broken my
CI runs in NLTK and SetFit.
* `isort` is no longer supported for Python 3.7, whereas Argilla still
supports 3.7 for now.
* `ruff` is absurdly fast, I actually can't believe how quick it is.

This PR consists of 3 commits at this time, and I would advise looking
at them commit-by-commit rather than at the PR as a whole. I'll also
explain each commit individually.

## [Add ruff basic
configuration](497420e)
I've added basic configuration for
[`ruff`](https://github.com/charliermarsh/ruff), a very efficient
linter. I recommend the following commands:
```
# Get all [F]ailures and [E]rrors
ruff .

# See all import sort errors
ruff . --select I

# Fix all import sort errors
ruff . --select I --fix
```

## [Remove unused imports, apply some noqa's, add
TYPE_CHECKING](f219acb)
The unused imports speaks for itself.

As for the `noqa`'s, `ruff` (like most linters) respect the `# noqa` (no
quality assurance) keyword. I've used the keyword in various locations
where linters would warn, but the behaviour is actually correct. As a
result, the output of `ruff .` now only points to questionable code.

Lastly, I added `TYPE_CHECKING` in some locations. If type hints hint at
objects that do not need to be imported during run-time, then it's
common to type hint like `arr: "numpy.ndarray"`. However, IDE's won't
understand what `arr` is. Python has implemented `TYPE_CHECKING` which
can be used to conditionally import code *only* when type checking. As a
result, the code block is not actually executed in practice, but the
inclusion of it allows for IDEs to better support development.
See an example here:
```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    import numpy

def func(arr: "numpy.ndarray") -> None:
    ...
```

## [Replace isort with ruff in
CI](e05f30e)
I've replaced `isort` (which was both [broken for
5.11.*](PyCQA/isort#2077) and [does not work
for Python 3.8 in
5.12.*](https://github.com/PyCQA/isort/releases/tag/5.12.0)) with `ruff`
in the CI, using both `--select I` to only select `isort` warnings and
`--fix` to immediately fix the warnings.

Then I ran `pre-commit run --all` to fix the ~67 outstanding issues in
the repository.

---

**Type of change**

- [x] Refactor (change restructuring the codebase without changing
functionality)

**How Has This Been Tested**

I verified that the behaviour did not change using `pytest tests`.

**Checklist**

- [x] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [ ] I added comments to my code
- [ ] I made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works

- Tom Aarsen

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…2333)

# Description

- [x] Documentation update

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Closes #2257

Hello!

## Pull Request overview
* Add `validate_assignment = True` to the `pydantic` Config to also
apply validation on assignment e.g. via `record.prediction = ...`.
* Add tests to ensure the correct behaviour.

## What was wrong?
See #2257 for details on the issue. The core issue is that the following
script would fail with a server-side error on `rg.log`:
```python
import argilla as rg

record = rg.TextClassificationRecord(
    text="Hello world, this is me!",
    prediction=[("LABEL1", 0.8), ("LABEL2", 0.2)],
    annotation="LABEL1",
    multi_label=False,
)
rg.log(record, "test_item_assignment")
record.prediction = "rubbish"
rg.log(record, "test_item_assignment")
```
<details><summary>Click to see failure logs using the develop
branch</summary>

```
023-02-14 11:19:40.794 | ERROR    | argilla.client.client:__log_internal__:103 - 
Cannot log data in dataset 'test_item_assignment'
Error: ValueError
Details: not enough values to unpack (expected 2, got 1)
Traceback (most recent call last):
  File "c:/code/argilla/demo_2257.py", line 13, in <module>
    rg.log(record, "test_item_assignment")
  File "C:\code\argilla\src\argilla\client\api.py", line 157, in log
    background=background,
  File "C:\code\argilla\src\argilla\client\client.py", line 305, in log
    return future.result()
  File "C:\Users\tom\.conda\envs\argilla\lib\concurrent\futures\_base.py", line 435, in result
    return self.__get_result()
  File "C:\Users\tom\.conda\envs\argilla\lib\concurrent\futures\_base.py", line 384, in __get_result
    raise self._exception
  File "C:\code\argilla\src\argilla\client\client.py", line 107, in __log_internal__
    raise ex
  File "C:\code\argilla\src\argilla\client\client.py", line 99, in __log_internal__
    return await api.log_async(*args, **kwargs)
  File "C:\code\argilla\src\argilla\client\client.py", line 389, in log_async
    records=[creation_class.from_client(r) for r in chunk],
  File "C:\code\argilla\src\argilla\client\client.py", line 389, in <listcomp>
    records=[creation_class.from_client(r) for r in chunk],
  File "C:\code\argilla\src\argilla\client\sdk\text_classification\models.py", line 65, in from_client
    for label, score in record.prediction
  File "C:\code\argilla\src\argilla\client\sdk\text_classification\models.py", line 64, in <listcomp>
    ClassPrediction(**{"class": label, "score": score})
ValueError: not enough values to unpack (expected 2, got 1)
  0%|                                                                                                                                                     | 0/1 [00:00<?, ?it/s]
```
</details>
This is unnecessary, as we can create a client-side error using our
validators, too.

## The fix
The fix is as simple as instructing `pydantic` to also trigger the
validation on assignment, e.g. on `record.prediction = ...`.

## Relevant documentation
See `validate_assignment` in
https://docs.pydantic.dev/usage/model_config/#options.

## After the fix
Using the same script as before, the error now becomes:
```
Traceback (most recent call last):
  File "c:/code/argilla/demo_2257.py", line 11, in <module>
    record.prediction = "rubbish"
  File "C:\code\argilla\src\argilla\client\models.py", line 280, in __setattr__
    super().__setattr__(name, value)
  File "pydantic\main.py", line 445, in pydantic.main.BaseModel.__setattr__
pydantic.error_wrappers.ValidationError: 1 validation error for TextClassificationRecord
prediction
  value is not a valid list (type=type_error.list)
```
Which triggers directly on the assignment rather than only when the
record is logged.

---

**Type of change**

- [x] Improvement (change adding some improvement to an existing
functionality)

**How Has This Been Tested**

All existing tests passed with my changes. Beyond that, I added some new
tests of my own. These tests fail on the `develop` branch, but pass on
this branch. This is because previously it was allowed to perform
`record.prediction = "rubbish"` as no validation was executed on it.

**Checklist**

- [x] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [x] I added comments to my code
- [ ] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works

---

- Tom Aarsen
…ion of spans (#2329)

# Description

I expanded slightly on the error message provided when providing spans
that do not match the tokenization.
Consider the following example script:
```python
import argilla as rg

record = rg.TokenClassificationRecord(
    text = "This is my text",
    tokens=["This", "is", "my", "text"],
    prediction=[("ORG", 0, 6), ("PER", 8, 14)],
)
```
The (truncated) output on the `develop` branch:
```
ValueError: Following entity spans are not aligned with provided tokenization
Spans:
- [This i] defined in This is my ...
- [my tex] defined in ...s is my text
Tokens:
['This', 'is', 'my', 'text']
```
The distinction between `defined in` and `This is` is unclear. I've
worked on this.

The (truncated) output after this PR:
```
ValueError: Following entity spans are not aligned with provided tokenization
Spans:
- [This i] defined in 'This is my ...'
- [my tex] defined in '...s is my text'
Tokens:
['This', 'is', 'my', 'text']
```
Note the additional `'`. Note that the changes rely on `repr`, so if the
snippet contains `'` itself, it uses `"` instead, e.g.:
```python
import argilla as rg

record = rg.TokenClassificationRecord(
    text = "This is Tom's text",
    tokens=["This", "is", "Tom", "'s", "text"],
    prediction=[("ORG", 0, 6), ("PER", 8, 16)],
)
```
```
ValueError: Following entity spans are not aligned with provided tokenization
Spans:
- [This i] defined in 'This is Tom...'
- [Tom's te] defined in "...s is Tom's text"
Tokens:
['This', 'is', 'Tom', "'s", 'text']
```

**Type of change**

- [x] Improvement (change adding some improvement to an existing
functionality)

**How Has This Been Tested**

Modified the relevant tests, ensured they worked.

**Checklist**

- [x] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [ ] I added comments to my code
- [ ] I made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works

- Tom Aarsen

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Upgrade the black line-length parameter in the project toml  file.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Closes #1843

Hello!

## Pull Request overview
* Use [`rich`](https://github.com/Textualize/rich) for logging,
tracebacks, printing and progressbars.
* Add `rich` as a dependency.
* Remove `loguru` as a dependency and remove all mentions of it in the
codebase.
* Simplify logging configuration according to the logging documentation.
* Update logging tests.

## Before & After
[`rich`](https://github.com/Textualize/rich) is a large Python library
for very colorful formatting in the terminal. Most importantly (in my
opinion), it improves the readability of logs and tracebacks. Let's go
over some before-and-afters:
<details><summary>Printing, Logging & Progressbars</summary>

### Before:

![image](https://user-images.githubusercontent.com/37621491/219089678-e57906d3-568d-480e-88a4-9240397f5229.png)

### After:

![image](https://user-images.githubusercontent.com/37621491/219089826-646d57a6-7e5b-426f-9ab1-d6d6317ec885.png)

Note that for the logs, the repeated information on the left side is
removed. Beyond that, the file location from which the log originates is
moved to the right side. Beyond that, the progressbar has been updated,
ahd the URL in the printed output has been highlighted automatically.

</details>

<details><summary>Tracebacks</summary>

### Before:

![image](https://user-images.githubusercontent.com/37621491/219090868-42cfe128-fd98-47ec-9d38-6f6f52a21373.png)

### After:

![image](https://user-images.githubusercontent.com/37621491/219090903-86f1fe11-d509-440d-8a6a-2833c344707b.png)

---

### Before:

![image](https://user-images.githubusercontent.com/37621491/219091343-96bae874-a673-4281-80c5-caebb67e348e.png)

### After:

![image](https://user-images.githubusercontent.com/37621491/219091193-d4cb1f64-11a7-4783-a9b2-0aec1abb8eb7.png)

---

### Before

![image](https://user-images.githubusercontent.com/37621491/219091791-aa8969a1-e0c1-4708-a23d-38d22c2406f2.png)

### After

![image](https://user-images.githubusercontent.com/37621491/219091878-e24c1f6b-83fa-4fed-9705-ede522faee82.png)

</details>

## Notes
Note that there are some changes in the logging configuration. Most of
all, it has been simplified according to the note from
[here](https://docs.python.org/3/library/logging.html#logging.Logger.propagate).
In my changes, I only attach our handler to the root logger and let
propagation take care of the rest.

Beyond that, I've set `rich` to 13.0.1 as newer versions experience a
StopIteration error like discussed
[here](Textualize/rich#2800 (comment)).

I've replaced `tqdm` with `rich` Progressbar when logging. However, I've
kept the `tqdm` progressbar for the [Weak
Labeling](https://github.com/argilla-io/argilla/blob/develop/src/argilla/labeling/text_classification/weak_labels.py)
for now.

One difference between the old situation and now is that all of the logs
are displayed during `pytest` under "live log call" (so, including
expected errors), while earlier only warnings were shown.

## What to review?
Please do the following when reviewing:
1. Ensuring that `rich` is correctly set to be installed whenever
someone installs `argilla`. I always set my dependencies explicitly in
setup.py like
[here](https://github.com/nltk/nltk/blob/develop/setup.py#L115) or
[here](https://github.com/huggingface/setfit/blob/78851287535305ef32f789c7a87004628172b5b6/setup.py#L47-L48),
but the one for `argilla` is
[empty](https://github.com/argilla-io/argilla/blob/develop/setup.py),
and `pyproject.toml` is used instead. I'd like for someone to look this
over.
2. Fetch this branch and run some arbitrary code. Load some data, log
some data, crash some programs, and get an idea of the changes.
Especially changes to loggers and tracebacks can be a bit personal, so
I'd like to get people on board with this. Otherwise we can scrap it or
find a compromise. After all, this is also a design PR.
3. Please have a look at my discussion points below.

## Discussion
`rich` is quite configurable, so there's some changes that we can make
still.
1. The `RichHandler` logging handler can be modified to e.g. include
rich tracebacks in their logs as discussed
[here](https://rich.readthedocs.io/en/latest/logging.html#handle-exceptions).
Are we interested in this?
2. The `rich` traceback handler can be set up to include local variables
in its traceback:
<details><summary>Click to see a rich traceback with local
variables</summary>


![image](https://user-images.githubusercontent.com/37621491/219096029-796b57ee-2f1b-485f-af35-c3effd44200b.png)
    </details>
Are we interested in this? I think this is a bit overkill in my opinion.
3. We can suppress frames from certain Python modules to exclude them
from the rich tracebacks. Are we interested in this?
4. The default rich traceback shows a maximum of 100 frames, which is a
*lot*. Are we interested in reducing this to only show the first and
last X?
5. The progress bar doesn't automatically stretch to fill the full
available width, while `tqdm` does. If we want, we can set `expand=True`
and it'll also expand to the entire width. Are we interested in this?
6. The progress "bar" does not need to be a bar, we can also use e.g. a
spinner animation. See some more info
[here](https://rich.readthedocs.io/en/latest/progress.html#columns). Are
we interested in this?

---

**Type of change**

- [x] Refactor (change restructuring the codebase without changing
functionality)

**How Has This Been Tested**

I've updated the tests according to my changes.

**Checklist**

- [x] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [x] I added comments to my code
- [ ] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works

- Tom Aarsen
# Description

Replace old `recogn.ai` emails with `argilla.io` ones.

This will improve [argilla Pypi package
page](https://pypi.org/project/argilla/) showing the correct contact
emails.

We still have a reference to `recogn.ai` on this JS model file:
https://github.com/argilla-io/argilla/blob/develop/frontend/models/Dataset.js#L22

Please @frascuchon and @keithCuniah can you confirm if that JS code can
be safely changed and how?
# Description

This PR move the labeling rule operations from the labeling rule service
to the text-classfication service.

**Type of change**

(Please delete options that are not relevant. Remember to title the PR
according to the type of change)

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [x] Refactor (change restructuring the codebase without changing
functionality)
- [ ] Improvement (change adding some improvement to an existing
functionality)
- [ ] Documentation update


**Checklist**

- [x] I have merged the original branch into my forked branch
- [x] I added relevant documentation
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [x] I added comments to my code
- [x] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works
updates:
- [github.com/charliermarsh/ruff-pre-commit: v0.0.244 → v0.0.249](astral-sh/ruff-pre-commit@v0.0.244...v0.0.249)
# Description

There's an "n" character missing at the end of the sentence, not a super
critical change

**Type of change**

- [x] Documentation update

**How Has This Been Tested**

By eye

**Checklist**

- [x] I have merged the original branch into my forked branch
- [x] I added relevant documentation
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [x] I added comments to my code
- [x] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works
Hello!

## Pull Request overview
* Remove `pyre` from the CI.
* Remove the `pyre` configuration file.

## Details
Pyre has not been shown useful. It only causes CI failures, and I don't
think anyone actually looks at the logs. Let's save ourselves to red
crosses and remove it. After all, the CI doesn't even run on the
`develop` branch, only on `main` (and `master` and `releases`).

See
[here](https://github.com/argilla-io/argilla/actions/runs/4184361432/jobs/7249906904)
for an example CI output of the `pyre` workflow.

**Type of change**

- [x] Removal of unused code

**Checklist**

- [x] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [ ] I added comments to my code
- [ ] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works

---

I intend to create a handful of other PRs to improve the CI situation.
In particular, I want green checkmarks to once again mean that a PR is
likely ready, and for red crosses to mean that it most certainly is not.

- Tom Aarsen
# Description

Adding dataset `workspace` attribute and deprecating `owner` attribute
for current Dataset model. The `owner` will be removed in the next
release.

**Type of change**

(Please delete options that are not relevant. Remember to title the PR
according to the type of change)

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [x] Refactor (change restructuring the codebase without changing
functionality)
- [ ] Improvement (change adding some improvement to an existing
functionality)
- [ ] Documentation update


**Checklist**

- [x] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [x] I added comments to my code
- [x] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works

---------

Co-authored-by: keithCuniah <[email protected]>
# Description

This PR allows to  fetch the active argilla client instance as follow:

```python
import argilla as rg
client = rg.active_client()

# from here, we can interact with API without extra header configuration
client.http_client.get("/api/datasets")
```

Closes #2183

**Type of change**

(Please delete options that are not relevant. Remember to title the PR
according to the type of change)

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [ ] Refactor (change restructuring the codebase without changing
functionality)
- [x] Improvement (change adding some improvement to an existing
functionality)
- [ ] Documentation update


**Checklist**

- [x] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [x] I added comments to my code
- [ ] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
…2373)

# Description

For old argilla versions, workspaces were not required to create
datasets. Some code changes were included to support this behavior

This PR clean these code sections and refactor them for those code parts
where old logic was used. For example, the `user.is_superuser` flag is
computed from the workspace list setup but also persisted to improve the
code usability.

We need still to provide a way to migrate those datasets that have no
dataset to a default workspace. @davidberenstein1957 can you provide a
section in docs where to include these steps from the release notes?

**Type of change**

(Please delete options that are not relevant. Remember to title the PR
according to the type of change)

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [x] Refactor (change restructuring the codebase without changing
functionality)
- [ ] Improvement (change adding some improvement to an existing
functionality)
- [ ] Documentation update


**Checklist**

- [x] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [x] I added comments to my code
- [ ] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works
# Description
If a code needs comments, except in some particular cases, it's not
clean code.
So maybe it's better to remove the line `I added comments to my code`
from the PR template.

**Type of change**

(Please delete options that are not relevant. Remember to title the PR
according to the type of change)

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [ ] Refactor (change restructuring the codebase without changing
functionality)
- [ ] Improvement (change adding some improvement to an existing
functionality)
- [x] Documentation update

**How Has This Been Tested**

(Please describe the tests that you ran to verify your changes. And
ideally, reference `tests`)

- N/A

**Checklist**

- N/A I have merged the original branch into my forked branch
- N/A I added relevant documentation
- N/A follows the style guidelines of this project
- N/A I did a self-review of my code
- N/A I added comments to my code
- N/A I made corresponding changes to the documentation
- N/A My changes generate no new warnings
- N/A I have added tests that prove my fix is effective or that my
feature works
<!--pre-commit.ci start-->
updates:
- [github.com/charliermarsh/ruff-pre-commit: v0.0.244 →
v0.0.249](astral-sh/ruff-pre-commit@v0.0.244...v0.0.249)
<!--pre-commit.ci end-->
Hello!

## Pull Request overview
* Skip CI jobs rather than letting them fail in 2 common scenarios:
  1. If a PR is created from a fork.
  2. If a PR does not contain code changes.

## Problems
### Scenario 1
If a PR is created from a fork, then certain secrets will be
inaccessible, e.g. `secrets.AR_DOCKER_USERNAME` and
`secrets.AR_DOCKER_PASSWORD`. This causes the "Docker Deploy" job to
fail, understandably. It would be better if these jobs would be skipped
instead.
See
[here](https://github.com/argilla-io/argilla/actions/runs/4240437103/jobs/7369703791)
for an example case.

### Scenario 2
There are two subsequent jobs: One to build the Python package, and one
to deploy it as a Docker image. The first one will stop halfway through
if there are no code changes found in the PR, but the second job will
still try to deploy the build artifact. Upon trying to load it, it will
crash.
See
[here](https://github.com/argilla-io/argilla/actions/runs/4229327600/jobs/7345607762)
for an example case.

## Fixes
### Scenario 1
Sadly, secrets are not accessible on the `build.if` and
`deploy_docker.if` level, so we can't use e.g. `if:
secrets.AR_DOCKER_USERNAME != ''`. Instead, we are required to use an
environment variable, e.g. `IS_DEPLOYABLE` to track whether the secrets
are accessible, and then add `if env.IS_DEPLOYABLE == 'true'` to all
steps.

### Scenario 2
This was as simple as setting `code_outputs` as an output from the
`build` job, which can then be read as input for the `deploy_docker`
job. This job is now skipped if there are no code changes in the
PR/commit.

---

**Type of change**

- [x] Refactor (change restructuring the codebase without changing
functionality)

**How Has This Been Tested**

Through this PR. Note: I have only been able to test scenario 1 (PR from
a fork). I haven't been able to test scenario 2 nor the "normal"
scenarios where these jobs do actually execute (i.e. first-party PR with
code changes). These are annoying to test easily. I think we're best off
merging this and reverting if any scenarios work unexpectedly, rather
than trying to test all of the scenarios in a separate repo.

**Checklist**

- [x] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [ ] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works

---

With this PR I hope to have reached the point where CI failures imply
actual issues and CI passes imply that a PR is "likely ready".

- Tom Aarsen
Hello!

# Description

As discussed in Slack, I've replaced "ar" with "rg" as the shorthand
import in the tests. Long live project-wide "find-and-replace"
functionality.

**Type of change**

- [x] Refactor (change restructuring the codebase without changing
functionality)

**How Has This Been Tested**

Running the test suite.

**Checklist**

- [x] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [ ] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works

---

- Tom Aarsen
…ator (#2380)

Closes #2367

Hello!

## Pull Request overview
* Implementation for `require_version` based on the transformer module.
* Implementation for `requires_version`, a decorator that wraps
`require_version`
* `requires_datasets`, `requires_spacy` and `requires_sklearn` as
shorthands for `requires_version` for specific modules.
* Removed `try-except ModuleNotFoundError` throughout the project for
e.g. `cleanlab`, `snorkel`, `flyingsquid`, `pgmpy`, `starlette` and
replaced them with `requires_version` decorators.
* Added tests to accompany the additions
* Modified existing tests

## Details
I have added `require_version` which ought to be used like:
```python
...
if train_size and test_size:
    require_version("scikit-learn")
    from sklearn.model_selection import train_test_split
    ...
```

And also `requires_version`:
```python
...
@requires_version("cleanlab")
def find_label_errors(
    records: Union[List[TextClassificationRecord], DatasetForTextClassification],
    ...
```
and some shorthands, e.g.:
```python
...

@requires_datasets
def to_datasets() -> "datasets.Dataset":
    """Exports your records to a `datasets.Dataset`.
    ...
```

When `require_version` is called, or when a function/method wrapped with
`requires_...` is called,
[`importlib.metadata`](https://docs.python.org/3/library/importlib.metadata.html)
is used to see if the package is installed, and at which version. This
is the recommended approach for collecting version information, i.e.
recommended over the less efficient `pkg_resources`.

### Advanced usage
We can set version requirements using this approach:
```python
require_version("datasets>1.17.0")
# or
@requires_version("datasets>1.17.0")
def ...
```
And not just one, we can set multiple:
```python
require_version("datasets>1.17.0,<2.0.0")
# or
@requires_version("datasets>1.17.0,<2.0.0")
def ...
```
We can also specify Python versions for certain
functions/methods/sections of code:
```python
require_version("python>3.7")
# or
@requires_version("python>3.7")
def ...
```

### Missing & outdated dependencies
Consider the following example:
```python
from argilla.utils.dependency import requires_datasets

@requires_datasets
def foo():
    pass

foo()
```
When executed without `datasets` installed, the following exception is
thrown:
```
ModuleNotFoundError: 'datasets' must be installed to use `foo`! You can install 'datasets' with this command: `pip install datasets>1.17.0`
```
Alternatively, if I install `datasets==1.16.0` and run it again, I get
this exception:
```
ImportError: datasets>1.17.0 must be installed to use `foo`, but found datasets==1.16.0. You can install a supported version of 'datasets' with this command: `pip install -U datasets>1.17.0`
```

### Notes
I had to update various tests that contained the following snippet:
```python
monkeypatch.setitem(sys.modules, "datasets", None)
with pytest.raises(ModuleNotFoundError):
    ...
```
The reasoning is that although `import datasets` would fail in this
case, the `require_version` function is not fooled by this monkeypatch
as it reads directly from files in `sys.meta_path`. As a result,
`require_version` still recognises that the module is installed. That
way, it becomes a bit harder to pretend that a module is not installed.
An alternative that I went with is clearing out `sys.meta_path`, which
means that nothing can be imported after the monkeypatch.

One other tiny note: the following snippet also stopped working.
```
monkeypatch.setattr(FlyingSquid, "_predict", mock_predict)
```
This is because `FlyingSquid` is now decorated by `requires_version`,
and this seems to have broken the `monkeypatch.setattr`. As a result, I
now perform the `monkeypatch.setattr` on the class instance instead of
the class.

I want to reiterate that everything from this section only affects the
tests: the behaviour in practice works well.

Lastly, the tests result in a 97% coverage on the new file.

---

**Type of change**

- [x] Refactor (change restructuring the codebase without changing
functionality)

**How Has This Been Tested**

The introduction of various tests under `tests/utils/test_dependency.py`
and through updating tests throughout the project.
I also ran some test cases manually.

**Checklist**

- [x] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [x] I added comments to my code
- [ ] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works

---

I'm open to feedback, as always.

cc: @davidberenstein1957 @dvsrepo due to the discussion in #2367

- Tom Aarsen
updates:
- [github.com/charliermarsh/ruff-pre-commit: v0.0.249 → v0.0.253](astral-sh/ruff-pre-commit@v0.0.249...v0.0.253)
<!--pre-commit.ci start-->
updates:
- [github.com/charliermarsh/ruff-pre-commit: v0.0.249 →
v0.0.253](astral-sh/ruff-pre-commit@v0.0.249...v0.0.253)
<!--pre-commit.ci end-->
…2339)

# Description

This is a stop-gap solution to issue
#1852
A more sophisticated solution can be found but in the mean time this
solves the inconvenience when there are tags more than 10.

**Type of change**

(Please delete options that are not relevant. Remember to title the PR
according to the type of change)

- [x] Improvement (change adding some improvement to an existing
functionality)

**How Has This Been Tested**

(Please describe the tests that you ran to verify your changes. And
ideally, reference `tests`)

- [ ] Test A
- [ ] Test B

**Checklist**

- [ ] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [ ] follows the style guidelines of this project
- [x] I did a self-review of my code
- [ ] I added comments to my code
- [ ] I made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
…et (#2434)

Hello!

## Pull Request overview
* Refactor `Datasets.scan` to never request more samples than required.
* Implements a `batch_size` variable which is now set to
`self.DEFAULT_SCAN_SIZE`. This parameter can easily be exposed in the
future for `batch_size` support, as discussed previously in Slack etc.

## Details
The core of the changes is that the URL is now built on-the-fly at every
request, rather than just once. For each request, the limit for that
request is computed using `min(limit, batch_size)`, and `limit` is
decremented, allowing it to represent a "remaining samples to fetch".
When `limit` reaches 0, i.e. 0 more samples to fetch, we return from
`scan`.

I've also added a check to ensure that the `limit` passed to `scan`
can't negative.

## Tests
For tests, I've created a `test_scan_efficient_limiting` function which
verifies the new and improved behaviour. It contains two monkeypatches:
1. `DEFAULT_SCAN_SIZE` is set to 10. Because our dataset in this test
only has 100 samples, we want to ensure that we can't just sample the
entire dataset with one request.
2. The `http_client.post` method is monkeypatched to allow us to track
the calls to the server.

We test the following scenarios:
* `limit=23` -> 3 requests: for 10, 10 and 3 samples.
* `limit=20` -> 2 requests: for 10, 10 samples.
* `limit=6` -> 1 request: for 6 samples.

There's also a test to cover the new ValueError if `limit` < 0.

## Effects
Consider the following script:
```python
import argilla as rg
import time

def test_scan_records(fields):
    client = rg.active_client()

    data = client.datasets.scan(
        name="banking77-topics-setfit",
        projection=fields,
        limit=1,  # <- Note, just one sample
    )
    start_t = time.time()
    list(data)
    print(f"load time for 1 sample with fields={fields}: {time.time() - start_t:.4f}s")

test_scan_records(set())
test_scan_records({"text"})
test_scan_records({"tokens"})
```

On this PR, this outputs:
```
load time for 1 sample with fields=set(): 0.0774s
load time for 1 sample with fields={'text'}: 0.0646s
load time for 1 sample with fields={'tokens'}: 0.0669s
```
On the `develop` branch, this outputs:
```
load time for 1 sample with fields=set(): 0.1355s
load time for 1 sample with fields={'text'}: 0.1347s
load time for 1 sample with fields={'tokens'}: 0.1173s
```
This can be repeated for `rg.load(..., limit=1)`, as that relies on
`.scan` under the hood.

Note that this is the most extreme example of performance gains. The
performance increases in almost all scenarios if `limit` is set, but the
effects are generally not this big. Going from fetching 250 samples 8
times to fetching 250 samples 7 times and 173 once doesn't have as big
of an impact.

---

**Type of change**

- [x] Refactor (change restructuring the codebase without changing
functionality)

**How Has This Been Tested**

See above.

**Checklist**

- [x] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [ ] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works

---

- Tom Aarsen
# Description

I re-used the logic for the dataset regex check for the workspace
declaration too.

Closes #2388 
**Type of change**

(Please delete options that are not relevant. Remember to title the PR
according to the type of change)

- [X] Bug fix (non-breaking change which fixes an issue)


**How Has This Been Tested**

(Please describe the tests that you ran to verify your changes. And
ideally, reference `tests`)

- [X] [Test
A](https://github.com/argilla-io/argilla/blob/5138961e4148749dca85e647fa32d8c9a13f87f2/tests/client/test_api.py#L97)


**Checklist**
davidberenstein1957 and others added 16 commits March 2, 2023 12:24
# Description
empty query strings were not returning any records. These are currently
set to None and therefore return records again.

Closes #2400 
Closes #2303 
**Type of change**

(Please delete options that are not relevant. Remember to title the PR
according to the type of change)

- [X] Bug fix (non-breaking change which fixes an issue)

**How Has This Been Tested**

(Please describe the tests that you ran to verify your changes. And
ideally, reference `tests`)

- [x] [Test
A](https://github.com/argilla-io/argilla/blob/34da84b5b0ac2c71d17459519915630dda7393a2/tests/client/test_api.py#L179)


**Checklist**

N.A.
This PR will simplify current datasets-related endpoints to adapt to the
new API definition.

- [x] Moving pydantic api models under the `server.schemas.datasets`
module
- [x] Making workspace required for dataset creation
- [x] Add dataset id to dataset endpoints response
- [x] Working with `dataset.workspace` instead of `dataset.owner`
- [x] Avoid dataset inheritance as much as possible

---------

Co-authored-by: Tom Aarsen <[email protected]>
Co-authored-by: David Berenstein <[email protected]>
Co-authored-by: keithCuniah <[email protected]>
Co-authored-by: Keith Cuniah <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: José Francisco Calvo <[email protected]>
# Description

Preparing new DB integration, some workspace validation and logic will
be removed. This PR prepares this integration by forcing sending
workspace on each request, instead using the `default_workspace`.

For the current clients and UIs, this won't be a problem since the
workspace is sent on each request right now.

**Type of change**

- [x] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [x] Refactor (change restructuring the codebase without changing
functionality)

**Checklist**

- [x] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [ ] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works
…d` (#2425)

# Description

Allow passing workspace as client parm for rglog or rgload. I also
enabled this for `rg.delete` and `rg.delete_records`.

Closes #2059 
**Type of change**

(Please delete options that are not relevant. Remember to title the PR
according to the type of change)


- [X] New feature (non-breaking change which adds functionality)
- [X] Refactor (change restructuring the codebase without changing
functionality)
- [X] Improvement (change adding some improvement to an existing
functionality)


**How Has This Been Tested**

(Please describe the tests that you ran to verify your changes. And
ideally, reference `tests`)

- [Test
A](https://github.com/argilla-io/argilla/blob/b227b09a423dae8792e520ce4f9da3dcb5fb0d0d/tests/client/test_api.py#L364)


**Checklist**

- [X] I have merged the original branch into my forked branch

- [X] follows the style guidelines of this project
- [X] I did a self-review of my code
- [] I made corresponding changes to the documentation
- [X] My changes generate no new warnings
- [X] I have added tests that prove my fix is effective or that my
feature works
…2455)

Closes #2453

## Pull Request overview
* Deprecate `chunk_size` in favor of `batch_size` in `rg.log`:
  * Move `chunk_size` to the end of all related signatures.
  * Set `chunk_size` default to None and update the typing accordingly.
  * Introduce `batch_size` in the old position in the signature.
  * Update docstrings accordingly.
  * Introduce a `FutureWarning` if `chunk_size` is used.
* Introduce test showing that `rg.log(..., chunk_size=100)` indeed
throws a warning.
* Update a warning to no longer include a newline and a lot of spaces
(see first comment of this PR)

## Details
Note that this deprecation is non-breaking: Code that uses `chunk_size`
will still work, as `batch_size = chunk_size` after a FutureWarning is
given, if `chunk_size` is not `None`.

## Discussion
* Should I use a FutureWarning? Or a DeprecationWarning? Or a
PendingDeprecationWarning? The last two make sense, but they are
[ignored by
default](https://docs.python.org/3/library/warnings.html#default-warning-filter),
I'm afraid.
* Is the deprecation message in the format that we like?

---

**Type of change**
- [x] New feature (non-breaking change which adds functionality)

**How Has This Been Tested**

I introduced a test, and ran all tests.

**Checklist**

- [x] I have merged the original branch into my forked branch
- [x] I added relevant documentation
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [x] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works

---

- Tom Aarsen
Closes #2454 

Hello!

## Pull Request overview
* Extend my previous work to expose a `batch_size` parameter to
`rg.load` (and `rg.scan`, `_load_records_new_fashion`).
  * Update the docstrings for the public methods.
* The actual implementation is very simple due to my earlier work in
#2434:
* `batch_size = self.DEFAULT_SCAN_SIZE` -> `batch_size = batch_size or
self.DEFAULT_SCAN_SIZE`
* Add an (untested!) warning if `batch_size` is supplied with an old
server version.
  * (We don't have compatibility tests for old server versions)
* I've simplified `test_scan_efficient_limiting` with `batch_size`
rather than using a mocker to update `DEFAULT_SCAN_SIZE`.
* I've also extended this test to test `rg.load` in addition to
`active_api().datasets.scan`.

## Note
Unlike previously discussed with @davidberenstein1957, I have *not*
added a warning like so:

https://github.com/argilla-io/argilla/blob/f5834a5408051bf1fa60d42aede6b325dc07fdbd/src/argilla/client/client.py#L338-L343
The server limits the maximum batch size to 500 for loading, so there is
no need to have a warning when using a batch size of over 5000.

## Discussion
1. Should I include in the docstring that the default batch size is 250?
That would be "hardcoded" into the docs, so if we ever change the
default (`self.DEFAULT_SCAN_SIZE`), then we would have to remember to
update the docs too.
2. Alternatively, should we deprecate `self.DEFAULT_SCAN_SIZE` and
enforce that `batch_size` must be set for `datasets.scan`, with a
default size of 250 everywhere? Then people can see the default batch
size in the signature?

I think we should do option 2. Let me know how you feel.

---

**Type of change**

- [x] New feature (non-breaking change which adds functionality)

**How Has This Been Tested**

Modified and updated a test, ran all tests.

**Checklist**

- [x] I have merged the original branch into my forked branch
- [x] I added relevant documentation
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [x] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works

---

- Tom Aarsen
# Description

I added more flexible app redirect to `api/docs`

Closes #2377 

**Type of change**

(Please delete options that are not relevant. Remember to title the PR
according to the type of change)

- [X] Bug fix (non-breaking change which fixes an issue)


**How Has This Been Tested**
N.A.

**Checklist**

N.A.

---------

Co-authored-by: frascuchon <[email protected]>
# Description

Added text2text support for spark-nlp training
small bug-fix for prepare for training with spacy textcat

Closes #2465
Closes #2482 

**Type of change**

(Please delete options that are not relevant. Remember to title the PR
according to the type of change)


- [X] New feature (non-breaking change which adds functionality)

**How Has This Been Tested**

N.A.

**Checklist**
N.A.

---------

Co-authored-by: dvsrepo <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
<!--pre-commit.ci start-->
updates:
- [github.com/charliermarsh/ruff-pre-commit: v0.0.253 →
v0.0.254](astral-sh/ruff-pre-commit@v0.0.253...v0.0.254)
<!--pre-commit.ci end-->

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
# Description
Added sort functionality to `rg.load` like so. `rg.load("name",
limit=100, sort=[("field", "asc|desc")])`

Closes #2433

**Type of change**

- [X] New feature (non-breaking change which adds functionality)
- [X] Documentation update

**How Has This Been Tested**

(Please describe the tests that you ran to verify your changes. And
ideally, reference `tests`)

- [X] [Test
A](https://github.com/argilla-io/argilla/blob/65031609fbad263e41ead233f7335bed83f22268/tests/client/test_api.py#L636)


**Checklist**

- [X] I have merged the original branch into my forked branch
- [X] I added relevant documentation
- [X] follows the style guidelines of this project
- [X] I did a self-review of my code
- [X] I made corresponding changes to the documentation
- [X] My changes generate no new warnings
- [X] I have added tests that prove my fix is effective or that my
feature works

---------

Co-authored-by: Francisco Aranda <[email protected]>
# Description

This PR improves annotation at record level and at bulk level for the
three different tasks

- Normalize the 2-step validation flow for all the tasks, except the
single label text classification
- Enhance the bulk annotation labeling for multi-label text
classification
- Disable automatic validation for multi-label text classification
- Include reset and clear for all task
- New styles

Closes [#2264](#2264)

**Type of change**

(Please delete options that are not relevant. Remember to title the PR
according to the type of change)

- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [ ] Refactor (change restructuring the codebase without changing
functionality)
- [ ] Improvement (change adding some improvement to an existing
functionality)
- [ ] Documentation update

**How Has This Been Tested**

(Please describe the tests that you ran to verify your changes. And
ideally, reference `tests`)

- [ ] Test A
- [ ] Test B

**Checklist**

- [x] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [x] follows the style guidelines of this project
- [x] I did a self-review of my code
- [ ] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works

---------

Co-authored-by: Keith Cuniah <[email protected]>
Co-authored-by: keithCuniah <[email protected]>
@frascuchon frascuchon changed the base branch from develop to main March 8, 2023 08:40
@codecov
Copy link

codecov bot commented Mar 8, 2023

Codecov Report

Patch coverage: 94.26% and project coverage change: +0.54 🎉

Comparison is base (6c679ad) 91.97% compared to head (3a91bbc) 92.51%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2500      +/-   ##
==========================================
+ Coverage   91.97%   92.51%   +0.54%     
==========================================
  Files         161      161              
  Lines        7923     7954      +31     
==========================================
+ Hits         7287     7359      +72     
+ Misses        636      595      -41     
Flag Coverage Δ
pytest 92.51% <94.26%> (+0.54%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/argilla/client/apis/metrics.py 92.30% <ø> (ø)
src/argilla/client/sdk/commons/errors.py 90.74% <ø> (ø)
src/argilla/client/sdk/text2text/api.py 0.00% <ø> (ø)
src/argilla/client/sdk/text2text/models.py 100.00% <ø> (ø)
src/argilla/client/sdk/token_classification/api.py 0.00% <0.00%> (ø)
.../argilla/client/sdk/token_classification/models.py 100.00% <ø> (ø)
src/argilla/client/sdk/users/api.py 100.00% <ø> (ø)
src/argilla/monitoring/_transformers.py 94.04% <ø> (ø)
src/argilla/monitoring/model_monitor.py 100.00% <ø> (ø)
src/argilla/server/commons/models.py 100.00% <ø> (ø)
... and 112 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

frascuchon and others added 4 commits March 8, 2023 11:30
1. Client will always send the workspace (even the username for
"Private" workspaces)
2. The workspace is not required for this current version, since old
clients don't send workspace when matches with username
Just a minimal change using new the `rg.active_client` function for
migration scripts. This allows copy-pasting the code snippet without
upgrading argilla.
The `rg.configure_dataset` accepts a workspace name as a param.

The inner dataset creation request will use the workspace inside the
request body. (See
[here](1cae0aa#diff-54f1605eb0a3cdc8cdd1c9c9e4628c5e30b3460588e640e38dfcfb95d8cd97e3L70))
# Description

Please include a summary of the changes and the related issue. Please
also include relevant motivation and context. List any dependencies that
are required for this change.

Closes #<issue_number>

**Type of change**

(Please delete options that are not relevant. Remember to title the PR
according to the type of change)

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [ ] Refactor (change restructuring the codebase without changing
functionality)
- [ ] Improvement (change adding some improvement to an existing
functionality)
- [x] Documentation update

**How Has This Been Tested**

(Please describe the tests that you ran to verify your changes. And
ideally, reference `tests`)

- [ ] Test A
- [ ] Test B

**Checklist**

- [x] I have merged the original branch into my forked branch
- [ ] I added relevant documentation
- [ ] follows the style guidelines of this project
- [ ] I did a self-review of my code
- [x] I made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
@frascuchon frascuchon marked this pull request as ready for review March 9, 2023 00:35
@frascuchon frascuchon merged commit cf75dd5 into main Mar 9, 2023
@damianpumar damianpumar deleted the releases/1.4.0 branch April 22, 2024 08:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants