argilla-io · davidberenstein1957 · Nov 27, 2023 · Nov 2, 2023 · Nov 6, 2023 · Nov 6, 2023
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -22,6 +22,7 @@ These are the section headers that we use:
 - Added `get_model_kwargs`, `get_trainer_kwargs`, `get_trainer_model`, `get_trainer_tokenizer` and `get_trainer` -methods to the `ArgillaTrainer` to improve interoperability across frameworks. ([#4214](https://github.com/argilla-io/argilla/pull/4214)).
 - Added additional formatting checks to the `ArgillaTrainer` to allow for better interoperability of `defaults` and `formatting_func` usage. ([#4214](https://github.com/argilla-io/argilla/pull/4214)).
 - Added a warning to the `update_config`-method of `ArgillaTrainer` to emphasize if the `kwargs` were updated correctly. ([#4214](https://github.com/argilla-io/argilla/pull/4214)).
+- Added `argilla.client.feedback.utils` module with `html_utils` and `assignment`. ([#4121](https://github.com/argilla-io/argilla/pull/4121))
 
 ### Fixed
 

diff --git a/docs/_source/_static/tutorials/making-most-of-markdown/audio-multimodal.PNG b/docs/_source/_static/tutorials/making-most-of-markdown/audio-multimodal.PNG
diff --git a/docs/_source/_static/tutorials/making-most-of-markdown/displacy.PNG b/docs/_source/_static/tutorials/making-most-of-markdown/displacy.PNG
diff --git a/docs/_source/_static/tutorials/making-most-of-markdown/image-multimodal.PNG b/docs/_source/_static/tutorials/making-most-of-markdown/image-multimodal.PNG
diff --git a/docs/_source/_static/tutorials/making-most-of-markdown/multi-modal.png b/docs/_source/_static/tutorials/making-most-of-markdown/multi-modal.png
diff --git a/docs/_source/_static/tutorials/making-most-of-markdown/video-multimodal.PNG b/docs/_source/_static/tutorials/making-most-of-markdown/video-multimodal.PNG
diff --git a/docs/_source/practical_guides/create_dataset.md b/docs/_source/practical_guides/create_dataset.md
@@ -51,7 +51,11 @@ You can define the fields using the Python SDK providing the following arguments
 - `name`: The name of the field, as it will be seen internally.
 - `title` (optional): The name of the field, as it will be displayed in the UI. Defaults to the `name` value, but capitalized.
 - `required` (optional): Whether the field is required or not. Defaults to `True`. Note that at least one field must be required.
-- `use_markdown`(optional): Specify whether you want markdown rendered in the UI. Defaults to `False`.
+- `use_markdown` (optional): Whether to render Markdown in the UI. Defaults to `False`. When set to `True`, you can use Markdown for text formatting and embedded multimedia content. For details, see this [tutorial](/tutorials/notebooks/making-most-of-markdown.ipynb).
+
+```{note}
+Multimedia in Markdown is supported, but it is still experimental. File sizes are limited by Elasticsearch constraints, and rendering and loading times may vary depending on your browser. We are working to improve this and welcome your feedback and suggestions!
+```
 
 ```python
 fields = [
@@ -86,7 +90,11 @@ The following arguments apply to specific question types:
 - `values`: In the `RatingQuestion` this will be any list of unique integers that represent the options that annotators can choose from. These values must be defined in the range [1, 10]. In the `RankingQuestion`, values will be a list of strings with the options they will need to rank. If you'd like the text of the options to be different in the UI and internally, you can pass a dictionary instead where the key is the internal name and the value is the text to display in the UI.
 - `labels`: In `LabelQuestion` and `MultiLabelQuestion` this is a list of strings with the options for these questions. If you'd like the text of the labels to be different in the UI and internally, you can pass a dictionary instead where the key is the internal name and the value the text to display in the UI.
 - `visible_labels` (optional): In `LabelQuestion` and `MultiLabelQuestion` this is the number of labels that will be visible in the UI. By default, the UI will show 20 labels and collapse the rest. Set your preferred number to change this limit or set `visible_labels=None` to show all options.
-- `use_markdown` (optional): In `TextQuestion` define whether the field should render markdown text. Defaults to `False`.
+- `use_markdown` (optional): In `TextQuestion`, whether the question should render Markdown text. Defaults to `False`. When set to `True`, you can use Markdown for text formatting and embedded multimedia content. For details, see this [tutorial](/tutorials/notebooks/making-most-of-markdown.ipynb).
+
+```{note}
+Multimedia in Markdown is supported, but it is still experimental. File sizes are limited by Elasticsearch constraints, and rendering and loading times may vary depending on your browser. We are working to improve this and welcome your feedback and suggestions!
+```
 
 Check out the following tabs to learn how to set up questions according to their type:
 

diff --git a/docs/_source/tutorials/notebooks/making-most-of-markdown.ipynb b/docs/_source/tutorials/notebooks/making-most-of-markdown.ipynb
diff --git a/src/argilla/client/feedback/utils/__init__.py b/src/argilla/client/feedback/utils/__init__.py
@@ -0,0 +1,39 @@
+#  Copyright 2021-present, the Recognai S.L. team.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+
+
+from argilla.client.feedback.utils.assignment import (
+    assign_records,
+    assign_records_to_groups,
+    assign_records_to_individuals,
+    assign_workspaces,
+    check_user,
+    check_workspace,
+)
+from argilla.client.feedback.utils.html_utils import (
+    audio_to_html,
+    create_token_highlights,
+    image_to_html,
+    media_to_html,
+    video_to_html,
+)
+
+__all__ = [
+    "audio_to_html",
+    "video_to_html",
+    "image_to_html",
+    "create_token_highlights",
+    "assign_records",
+    "assign_workspaces",
+]
diff --git a/src/argilla/client/feedback/utils/assignment.py b/src/argilla/client/feedback/utils/assignment.py
@@ -0,0 +1,262 @@
+#  Copyright 2021-present, the Recognai S.L. team.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+
+
+import random
+import warnings
+from collections import defaultdict
+from typing import Any, Dict, List, Union
+
+from rich.progress import Progress
+
+from argilla.client.users import User
+from argilla.client.workspaces import Workspace
+
+
+def check_user(user_to_check: Union[str, User]) -> User:
+    """
+    Helper function to check if the input is a User object. If it's a string, it attempts to retrieve the User object.
+    If the User does not exist, it creates a new User with a default password and role.
+
+    Args:
+        user_to_check: a user object or a string that represents a username
+
+    Returns:
+        The User object corresponding to the input.
+    """
+    if isinstance(user_to_check, User):
+        user = user_to_check
+    else:
+        try:
+            user = User.from_name(user_to_check)
+        except ValueError:
+            user = User.create(username=user_to_check, first_name=user_to_check, password="12345678", role="annotator")
+            warnings.warn(
+                f"The user {user.username} was created with a default password. We recommend changing it for security reasons.",
+                UserWarning,
+            )
+    return user
+
+
+def check_workspace(workspace_to_check: str) -> Workspace:
+    """
+    Helper function to check if the workspace exists. If it does not exist, it creates a new one.
+
+    Args:
+        workspace_to_check: a workspace string name
+
+    Returns:
+        The Workspace object corresponding to the input.
+    """
+    try:
+        workspace = Workspace.from_name(workspace_to_check)
+    except Exception:
+        workspace = Workspace.create(workspace_to_check)
+    return workspace
+
+
+def assign_records_to_groups(
+    groups: Dict[str, List[Any]], records: List[Any], overlap: int, shuffle: bool = True
+) -> Dict[str, Dict[str, Any]]:
+    """
+    Assign records to predefined groups with controlled overlap (for the groups) and optional shuffle. All members of the same group will annotate the same records.
+
+    Args:
+        groups: A dictionary where keys are group names and values are lists of usernames or user objects.
+        records: A list of records to be assigned.
+        overlap: The number of consecutive groups each record is assigned to. Both 0 and 1 assign each record to a single group (no overlap).
+        shuffle: If True, shuffle the records before assignment. Defaults to True.
+
+    Returns:
+        A dictionary where each key is a group and its value is another dictionary, which maps usernames to their respective assigned records.
+
+    Raises:
+        ValueError: If `overlap` is negative, or greater than or equal to the number of groups.
+    """
+    if overlap < 0 or overlap >= len(groups.keys()):
+        raise ValueError("Overlap must be less than the number of groups and must not be negative.")
+
+    if len(records) < len(groups.keys()):
+        warnings.warn(
+            "The number of groups is higher than the number of records. Some users will not be assigned any records.",
+            UserWarning,
+        )
+
+    if shuffle:
+        random.shuffle(records)
+
+    assignments = {}
+    assignments_grouped = {}
+    group_names = list(groups.keys())
+    num_groups = len(group_names)
+    overlap = 1 if overlap == 0 else overlap
+
+    group_records = defaultdict(list)
+    with Progress() as progress:
+        task = progress.add_task("[green]Processing records...", total=len(records))
+
+        for idx, record in enumerate(records):
+            for offset in range(overlap):
+                group_index = (idx + offset) % num_groups
+                group_name = group_names[group_index]
+                group_records[group_name].append(record)
+
+            progress.update(task, advance=1)
+
+    for group, users in groups.items():
+        users = [check_user(user) for user in users]
+        for user in users:
+            assignments[user] = group_records[group]
+
+        assignments_grouped[group] = {user.username: assignments.get(user, []) for user in users}
+
+    return assignments_grouped
+
+
+def assign_records_to_individuals(
+    users: List[Any], records: List[Any], overlap: int, shuffle: bool = True
+) -> Dict[str, List[Any]]:
+    """
+    Assign records to users with controlled overlap and optional shuffle.
+
+    Args:
+        users: A list of usernames or user objects. Usernames are resolved to users (and created if missing) via `check_user`.
+        records: A list of record objects to be assigned to users.
+        overlap: The number of consecutive users each record is assigned to. Both 0 and 1 assign each record to a single user (no overlap).
+        shuffle: If True, the records list will be shuffled before assignment. Defaults to True.
+
+    Returns:
+        A dictionary where keys are usernames and values are lists of assigned records.
+
+    Raises:
+        ValueError: If `overlap` is negative, or greater than or equal to the number of users.
+    """
+    if overlap < 0 or overlap >= len(users):
+        raise ValueError("Overlap must be less than the number of users and must not be negative.")
+
+    if len(records) < len(users):
+        warnings.warn(
+            "The number of users is higher than the number of records. Some users will not be assigned any records.",
+            UserWarning,
+        )
+
+    if shuffle:
+        random.shuffle(records)
+
+    users = [check_user(user) for user in users]
+    assignments = {user.username: [] for user in users}
+
+    num_users = len(users)
+    overlap = 1 if overlap == 0 else overlap
+
+    with Progress() as progress:
+        task = progress.add_task("[green]Processing records...", total=len(records))
+
+        for idx, record in enumerate(records):
+            for offset in range(overlap):
+                user_index = (idx + offset) % num_users
+                user = users[user_index].username
+                assignments[user].append(record)
+
+            progress.update(task, advance=1)
+
+    return assignments
+
+
+def assign_records(
+    users: Union[Dict[str, List[Any]], List[Any]], records: List[Any], overlap: int, shuffle: bool = True
+) -> Union[Dict[str, List[Any]], Dict[str, Dict[str, Any]]]:
+    """
+    Assign records to either groups or individuals, with controlled overlap and optional shuffle.
+
+    Args:
+        users: Either a dictionary of groups (group name to list of users) or a list of individual usernames or user objects.
+        records: A list of record objects to be assigned.
+        overlap: The number of consecutive users or groups each record is assigned to. Both 0 and 1 assign each record only once (no overlap).
+        shuffle: If True, the records list will be shuffled before assignment. Defaults to True.
+
+    Returns:
+        A dictionary where each key is a group and its value is another dictionary, which maps usernames to their respective assigned records.
+        Or a dictionary where keys are usernames and values are lists of assigned records.
+
+    Examples:
+        >>> from argilla.client.feedback.utils import assign_records
+        >>> individual_assignments = assign_records([user1, user2, user3], records, 0, False)
+        >>> group_assignments = assign_records({group1: [user1, user2], group2: [user3]}, records, 1, False)
+
+    """
+    if isinstance(users, dict):
+        return assign_records_to_groups(users, records, overlap, shuffle)
+    elif isinstance(users, list):
+        return assign_records_to_individuals(users, records, overlap, shuffle)
+
+
+def assign_workspaces(
+    assignments: Union[Dict[str, List[Any]], Dict[str, Dict[str, Any]]], workspace_type: str
+) -> Dict[str, List[Any]]:
+    """
+    Assign workspaces (and create them if needed) to either groups or individuals.
+
+    Args:
+        assignments: Either a dictionary of groups or a dictionary of users.
+        workspace_type: Either 'group' (each group in a workspace), 'group_personal' (each member in a workspace) or 'individual' (each person in a workspace).
+
+    Returns:
+        A dictionary where each key is a workspace name and its value is a list of user names.
+
+    Examples:
+        >>> from argilla.client.feedback.utils import assign_workspaces
+        >>> wk_assignments = assign_workspaces(group_assignments, "group")
+        >>> wk_assignments = assign_workspaces(group_assignments, "group_personal")
+        >>> wk_assignments = assign_workspaces(individual_assignments, "individual")
+
+    """
+    wk_assignments = {}
+
+    for group, users in assignments.items():
+        if workspace_type == "group":
+            workspace_name = group
+            user_ids = [check_user(user).id for user in users.keys()]
+
+        elif workspace_type == "group_personal":
+            for user in users.keys():
+                workspace_name = user
+                user_ids = [check_user(user).id]
+                workspace = check_workspace(workspace_name)
+
+                for user_id in user_ids:
+                    try:
+                        workspace.add_user(user_id)
+                    except Exception:
+                        pass
+
+                wk_assignments[workspace_name] = [User.from_id(user).username for user in workspace.users]
+
+            continue
+
+        elif workspace_type == "individual":
+            workspace_name = group
+            user_ids = [check_user(group).id]
+
+        workspace = check_workspace(workspace_name)
+
+        for user_id in user_ids:
+            try:
+                workspace.add_user(user_id)
+            except Exception:
+                pass
+
+        wk_assignments[workspace_name] = [User.from_id(user).username for user in workspace.users]
+
+    return wk_assignments
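
The modulo-based loop shared by `assign_records_to_groups` and `assign_records_to_individuals` can be tried in isolation. The following is a minimal sketch, not the code from this PR: `round_robin_assign` and the sample names and records are hypothetical, and it works on plain strings so it needs no Argilla server, users, or progress bar. It only reproduces the validation and the `(idx + offset) % n` indexing shown in the diff.

```python
from typing import Dict, List, Sequence


def round_robin_assign(names: Sequence[str], records: Sequence[str], overlap: int) -> Dict[str, List[str]]:
    """Hypothetical stand-in for the assignment loop, using plain strings."""
    # Same validation as the diff: overlap must be in [0, len(names)).
    if overlap < 0 or overlap >= len(names):
        raise ValueError("Overlap must be less than the number of names and must not be negative.")

    # As in the diff, 0 is mapped to 1, so 0 and 1 both give one copy per record.
    copies = 1 if overlap == 0 else overlap

    assignments: Dict[str, List[str]] = {name: [] for name in names}
    for idx, record in enumerate(records):
        for offset in range(copies):
            # Consecutive names receive the record, wrapping around at the end.
            target = names[(idx + offset) % len(names)]
            assignments[target].append(record)
    return assignments


names = ["ann", "bob", "cat"]
records = ["r0", "r1", "r2", "r3"]

no_overlap = round_robin_assign(names, records, overlap=0)
double = round_robin_assign(names, records, overlap=2)

print(no_overlap)  # {'ann': ['r0', 'r3'], 'bob': ['r1'], 'cat': ['r2']}
print(double)      # {'ann': ['r0', 'r2', 'r3'], 'bob': ['r0', 'r1', 'r3'], 'cat': ['r1', 'r2']}
```

Two consequences of this scheme are worth keeping in mind when reviewing: `overlap=1` behaves exactly like `overlap=0`, and because `overlap` must be strictly less than the number of users or groups, a record can never be assigned to every one of them. The functions in the diff also call `random.shuffle(records)` when `shuffle=True`, which reorders the caller's list in place; the sketch omits shuffling to keep the output deterministic.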