feat: adding utils module and functions #4121

sdiazlor · 2023-11-02T16:56:54Z

Description

Adds a new utils module:

submodule: HTML (media_to_html and create_token_highlights)
submodule: records (assign_records)

Closes #4030
Closes #4003
Closes #3803
Closes #3928
Closes #4031

Type of change

(Please delete options that are not relevant. Remember to title the PR according to the type of change)

New feature (non-breaking change which adds functionality)
Refactor (change restructuring the codebase without changing functionality)
Improvement (change adding some improvement to an existing functionality)

How Has This Been Tested

(Please describe the tests that you ran to verify your changes. And ideally, reference tests)

Test A
Test B

Checklist

I added relevant documentation
I followed the style guidelines of this project
I did a self-review of my code
I made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
I filled out the contributor form (see text above)
I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)

github-actions · 2023-11-02T17:34:03Z

The URL of the deployed environment for this PR is https://argilla-quickstart-pr-4121-ki24f765kq-no.a.run.app

davidberenstein1957

@sar

src/argilla/client/feedback/utils/html.py

davidberenstein1957 · 2023-11-06T14:07:58Z

src/argilla/client/feedback/utils/html.py

+
+
+def create_token_highlights(
+    tokens: List[str], weights: List[float], c_map: Optional[Union[str, Callable]] = None


Suggested change

-    tokens: List[str], weights: List[float], c_map: Optional[Union[str, Callable]] = None
+    tokens: List[str], weights: List[float], c_map: Optional[Union[str, Callable]] = "viridis"

@sdiazlor we would also need to make some changes later in the code.

@davidberenstein1957 yeah, of course, let me know if you want to talk more about the approach

@sdiazlor we just need to remove the if None check we use later on
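
A minimal sketch of what the suggestion amounts to, not the code in this PR: with a string default such as "viridis", the function can resolve the color map directly and no longer needs a None branch. The names highlight_tokens_sketch, _toy_viridis and _NAMED_MAPS are hypothetical, and the two-color blend is only a stand-in for a real "viridis" color map so the example runs with the standard library alone.

```python
import html
from typing import Callable, List, Tuple, Union

RGB = Tuple[int, int, int]


def _toy_viridis(weight: float) -> RGB:
    """Hypothetical stand-in for a real "viridis" map: blends two endpoint colors."""
    low, high = (68, 1, 84), (253, 231, 37)
    w = min(max(weight, 0.0), 1.0)  # clamp the weight to [0, 1]
    return tuple(round(lo + (hi - lo) * w) for lo, hi in zip(low, high))


# Assumption: named color maps are looked up in a small registry.
_NAMED_MAPS = {"viridis": _toy_viridis}


def highlight_tokens_sketch(
    tokens: List[str],
    weights: List[float],
    c_map: Union[str, Callable[[float], RGB]] = "viridis",
) -> str:
    """Wrap each token in a <span> whose background color reflects its weight."""
    if len(tokens) != len(weights):
        raise ValueError("tokens and weights must have the same length")
    # With a string default there is no `if c_map is None` branch to maintain.
    color_fn = _NAMED_MAPS[c_map] if isinstance(c_map, str) else c_map
    spans = []
    for token, weight in zip(tokens, weights):
        r, g, b = color_fn(weight)
        spans.append(
            f'<span style="background-color: rgb({r}, {g}, {b})">{html.escape(token)}</span>'
        )
    return " ".join(spans)
```

Callers that want their own colors can still pass a callable, for example highlight_tokens_sketch(["a"], [0.5], c_map=lambda w: (255, 0, 0)).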

davidberenstein1957 · 2023-11-06T16:44:47Z

src/argilla/client/feedback/utils/html.py

+}
+
+
+def media_to_html(media_type: str, path: str, file_type: Optional[str] = None) -> str:


Hi, could we also potentially allow for passing a byte string directly instead of passing a path? Perhaps we can check to allow for a path or non-b64encoded string.
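
One way the request could look, as a rough sketch under stated assumptions rather than the implementation in this PR: the source argument accepts either a path (str) or raw bytes, and both end up base64-encoded in a data URL. The name media_to_html_sketch, the _TAGS templates and the rule that file_type is required for raw bytes are all assumptions made for illustration.

```python
import base64
from pathlib import Path
from typing import Optional, Union

# Assumption: one HTML template per supported media type.
_TAGS = {
    "image": '<img src="{src}">',
    "audio": '<audio controls src="{src}"></audio>',
    "video": '<video controls src="{src}"></video>',
}


def media_to_html_sketch(
    media_type: str,
    media_source: Union[str, bytes],
    file_type: Optional[str] = None,
) -> str:
    """Embed media given as a file path or as raw (not yet base64-encoded) bytes."""
    if media_type not in _TAGS:
        raise ValueError(f"unsupported media_type: {media_type!r}")
    if isinstance(media_source, bytes):
        # Raw bytes carry no file extension, so the caller must name the format.
        if file_type is None:
            raise ValueError("file_type is required when passing raw bytes")
        raw = media_source
    else:
        path = Path(media_source)
        raw = path.read_bytes()
        # Fall back to the file extension when file_type is not given.
        file_type = file_type or path.suffix.lstrip(".").lower()
    encoded = base64.b64encode(raw).decode("ascii")
    return _TAGS[media_type].format(src=f"data:{media_type}/{file_type};base64,{encoded}")
```

Dispatching on isinstance keeps a single parameter for both cases, at the cost of making file_type mandatory whenever bytes are passed.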

davidberenstein1957 · 2023-11-07T12:23:37Z

src/argilla/client/feedback/utils/html.py

+}
+
+
+def media_to_html(media_type: str, media_source: Union[str, bytes], file_type: Optional[str] = None) -> str:


where is bytes defined?

davidberenstein1957 · 2023-11-07T12:25:52Z

src/argilla/client/feedback/utils/html.py

+    ]
+
+    # Get the color map if set to None or not indicated
+    if c_map is None:


we can remove this part

davidberenstein1957 · 2023-11-13T17:16:15Z

Hi @sdiazlor, I think we can use a small subset of the examples in docs/_source/tutorials/notebooks/making-most-of-markdown.ipynb to showcase the point, without requiring users to download a lot of content from the HF datasets. Also, could you add some other references to multi-modal support in the documentation, and potentially some task_templates such as for_image_classification/for_audio_classification/multi_model_classification or multi_modal_to_text (not really sure which naming covers the use case most explicitly)?

Perhaps we can add a reference to this section in the tutorial about making most of markdown? https://docs.argilla.io/en/latest/practical_guides/create_dataset.html#define-fields

davidberenstein1957 · 2023-11-14T14:59:59Z

@sdiazlor, this looks great :)

davidberenstein1957 · 2023-11-15T14:18:08Z

@sdiazlor, can you also add the same md reference in the markdown section for both questions and fields?

sdiazlor · 2023-11-20T13:21:00Z

@davidberenstein1957 I added the whole assign_records logic to assignment.py (including the part for the docs). Let me know your comments.
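
For readers new to the idea, a hypothetical sketch of what record assignment can look like. It is not the assign_records code in assignment.py, whose signature and behavior are not shown in this thread. The function name, the overlap parameter and the round-robin strategy are all assumptions, and user names are assumed to be unique.

```python
from itertools import cycle
from typing import Dict, Hashable, List, Sequence


def assign_records_sketch(
    users: Sequence[str],
    records: Sequence[Hashable],
    overlap: int = 1,
) -> Dict[str, List[Hashable]]:
    """Distribute records round-robin so each record goes to `overlap` users."""
    if not users:
        raise ValueError("at least one user is required")
    if not 1 <= overlap <= len(users):
        raise ValueError("overlap must be between 1 and the number of users")
    assignments: Dict[str, List[Hashable]] = {user: [] for user in users}
    user_cycle = cycle(users)
    for record in records:
        # overlap <= len(users), so consecutive picks are always distinct users.
        for _ in range(overlap):
            assignments[next(user_cycle)].append(record)
    return assignments
```

Under these assumptions, three users, three records and overlap=2 give every user two records and every record two annotators.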

davidberenstein1957 · 2023-11-20T13:24:56Z