[ENHANCEMENT] argilla: simplify structure for flatten records to list #5137

Merged

@frascuchon frascuchon commented Jul 1, 2024

Pull Request Template

This PR changes the structure generated by to_list(flatten=True) to make responses easier to read. The response content for each question is split into two columns, one for values and one for users, so user IDs are no longer part of the column names.

The result for the following record:

record = rg.Record(
    fields={"field": "The field"},
    metadata={"key": "value"},
    responses=[
        rg.Response(question_name="q1", value="value", user_id=user_a),
        rg.Response(question_name="q2", value="value", user_id=user_a),
        rg.Response(question_name="q2", value="value", user_id=user_b),
        rg.Response(question_name="q1", value="value", user_id=user_c),
    ],
    suggestions=[
        rg.Suggestion(question_name="q1", value="value", score=0.1, agent="test"),
        rg.Suggestion(question_name="q2", value="value", score=0.9),
    ],
)

is:

{
    "id": <record_id>,
    "_server_id": None,
    "field": "The field",
    "key": "value",
    "q1.responses": ["value", "value"],
    "q1.responses.users": [str(user_a), str(user_c)],
    "q2.responses": ["value", "value"],
    "q2.responses.users": [str(user_a), str(user_b)],
    "q1.suggestion": "value",
    "q1.suggestion.score": 0.1,
    "q1.suggestion.agent": "test",
    "q2.suggestion": "value",
    "q2.suggestion.score": 0.9,
    "q2.suggestion.agent": None,
}
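The mapping from the record to the flat dict above can be sketched in plain Python. This is only an illustration under stated assumptions, not Argilla's implementation: `flatten_record` is a hypothetical helper, and responses and suggestions are modelled as plain dicts rather than `rg.Response` and `rg.Suggestion` objects.

```python
from collections import defaultdict


def flatten_record(record_id, fields, metadata, responses, suggestions):
    """Hypothetical helper (not Argilla's code) that builds the flat dict."""
    flat = {"id": record_id, "_server_id": None, **fields, **metadata}

    # Group response values and user IDs per question, preserving order,
    # so both lists stay aligned by position.
    values, users = defaultdict(list), defaultdict(list)
    for response in responses:
        question = response["question_name"]
        values[question].append(response["value"])
        users[question].append(str(response["user_id"]))
    for question in values:
        flat[f"{question}.responses"] = values[question]
        flat[f"{question}.responses.users"] = users[question]

    # One suggestion per question; a missing score or agent becomes None.
    for suggestion in suggestions:
        question = suggestion["question_name"]
        flat[f"{question}.suggestion"] = suggestion["value"]
        flat[f"{question}.suggestion.score"] = suggestion.get("score")
        flat[f"{question}.suggestion.agent"] = suggestion.get("agent")
    return flat


# Plain strings stand in for the user IDs of the example record.
flat = flatten_record(
    record_id="rec-1",
    fields={"field": "The field"},
    metadata={"key": "value"},
    responses=[
        {"question_name": "q1", "value": "value", "user_id": "user_a"},
        {"question_name": "q2", "value": "value", "user_id": "user_a"},
        {"question_name": "q2", "value": "value", "user_id": "user_b"},
        {"question_name": "q1", "value": "value", "user_id": "user_c"},
    ],
    suggestions=[
        {"question_name": "q1", "value": "value", "score": 0.1, "agent": "test"},
        {"question_name": "q2", "value": "value", "score": 0.9},
    ],
)
```

Because the values and users lists are built in the same loop, the entry at a given position in `q1.responses` belongs to the user at the same position in `q1.responses.users`.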

Refs #4936

Type of change

  • Improvement (change adding some improvement to an existing functionality)

How Has This Been Tested

Checklist

  • I added relevant documentation
  • follows the style guidelines of this project
  • I did a self-review of my code
  • I made corresponding changes to the documentation
  • I confirm my changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)

@frascuchon frascuchon marked this pull request as ready for review July 1, 2024 11:08
@frascuchon frascuchon self-assigned this Jul 1, 2024
@frascuchon frascuchon added this to the v2.0.0 milestone Jul 1, 2024

@burtenshaw burtenshaw left a comment

This looks like a real improvement.

One thing, though: because responses are now an array of values, the flatten parameter name is no longer accurate, i.e. the resulting structure is not flat. Should we consider renaming the parameter? For example, we could call it nested and default it to True.
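To illustrate the point about list-valued columns: a consumer that wants one entry per response has to pair the two parallel lists itself. A minimal sketch, assuming the column naming shown in the PR description; `explode_responses` is a hypothetical helper and not part of Argilla.

```python
def explode_responses(flat_record, question_name):
    """Hypothetical helper: return one (user, value) pair per response."""
    values = flat_record.get(f"{question_name}.responses", [])
    users = flat_record.get(f"{question_name}.responses.users", [])
    # The two lists are assumed to be aligned by position.
    return list(zip(users, values))


flat_record = {
    "q1.responses": ["value", "value"],
    "q1.responses.users": ["user_a", "user_c"],
}
pairs = explode_responses(flat_record, "q1")
```

A question with no response columns yields an empty list here, which is a choice of this sketch rather than documented behavior.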

argilla/src/argilla/records/_io/_datasets.py (outdated, resolved)
@frascuchon frascuchon merged commit 120160d into develop Jul 3, 2024
6 checks passed
@frascuchon frascuchon deleted the refactor/simplify-structure-for-flatten-records-to-list branch July 3, 2024 09:11
frascuchon added a commit that referenced this pull request Jul 3, 2024
# Pull Request Template

> [!NOTE]
> This PR should be merged after #5137.


This PR addresses problems when exporting records to dicts, including
partially filled records. It fixes the errors reported in
#4936 (comment)

Close #4936

**Type of change**

- Bug fix (non-breaking change which fixes an issue)
- Refactor (change restructuring the codebase without changing
functionality)
- Improvement (change adding some improvement to an existing
functionality)

**How Has This Been Tested**

**Checklist**

- I added relevant documentation
- follows the style guidelines of this project
- I did a self-review of my code
- I made corresponding changes to the documentation
- I confirm my changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature
works
- I have added relevant notes to the CHANGELOG.md file (See
https://keepachangelog.com/)

---------

Co-authored-by: burtenshaw <[email protected]>
Co-authored-by: Ben Burtenshaw <[email protected]>
4 participants