feat: add v1 enpoint to set `Response` status #2756

gabrielmbmb · 2023-04-23T14:51:52Z

Description

I've added a new endpoint PUT /api/v1/responses/{response_id}/status that allows setting the status of a Response.

To do so, I've added a new column to the Response model/table called status, which right now has 3 possible values: pending (which is the default value), submitted, and discarded. I've also updated the alembic migrations files to include this new column, which is not nullable (it will have a default value always), and also an index will be created for it, as I suppose that we would like to filter by it and having an index will allow querying using a filter on this column faster.

Regarding the endpoint definition, I've decided to just add a new PUT endpoint to update the status of the Response instead of using the PUT /v1/response/{response_id} endpoint, because I think:

It makes the API's intent more explicit and easier to understand.
It has better separation of concerns as we can more easily optimize and scale this specific operation independently from the update response operation. Also, if in the future is needed to do something after status has been set, it will be easier this way than having an if condition in the update response endpoint that checks if the status has been updated, and if so then do something specific for when the status has changed.

As an alternative to this new endpoint, we could have:

Created two endpoints (PUT /api/v1/responses/{response_id}/submit and PUT /api/v1/responses/{response_id}/discard), one for submitting and one for discarding, but I think this is not a good idea, because if a new possible status is introduced, then a new endpoint has to be created and maintained. The logic of both endpoints will be exactly the same, so we would have two endpoints almost identical. Also, this approach would require the client to send a PUT request to a different endpoint depending on the status, and this can make the API harder to use.
Create one endpoint very similar to the one implemented but passing the status via URL path PUT /api/v1/responses/{response_id}/status/{status}. This approach has the inconvenience that the client would have to send a PUT request to a different endpoint each time the status has to be changed.

Closes #2733

Type of change

(Please delete options that are not relevant. Remember to title the PR according to the type of change)

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Refactor (change restructuring the codebase without changing functionality)
Improvement (change adding some improvement to an existing functionality)
Documentation update

How Has This Been Tested

I've added unit tests for all the possible scenarios when calling this endpoint.

Checklist

I have merged the original branch into my forked branch
I added relevant documentation
follows the style guidelines of this project
I did a self-review of my code
I made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)

gabrielmbmb · 2023-04-23T14:58:18Z

A review @frascuchon and @jfcalvo would be much appreciated! I would like to also hear your opinion. Also, I've some questions:

Do we need to save the id of the user that set the status for a Response? Right now only the user that created the response or an admin can do it, so I guess is not that important.
As only the user that created the response and the admin can set the status for a Response, I guess that the ResponsePolicyV1.update is enough, but in the future if users can set the status for a response that has not been created by them, then I guess that a new policy would be needed.

…endpoint

jfcalvo · 2023-04-24T09:50:49Z

Hi @gabrielmbmb, thanks for the PR. I will take a look to it.

Regarding your questions:

Do we need to save the id of the user that set the status for a Response? Right now only the user that created the response or an admin can do it, so I guess is not that important.

Although is not within the immediate scope of the issue, if you have the time and desire to work on it, you're welcome to do so.

As only the user that created the response and the admin can set the status for a Response, I guess that the ResponsePolicyV1.update is enough, but in the future if users can set the status for a response that has not been created by them, then I guess that a new policy would be needed.

I think it's correct to use ResponsePolicyV1.update for now.

jfcalvo

I have added some questions that you can answer at your convenience and also included some suggestions to help improve the solution. Please feel free to respond to the questions whenever you have time.

jfcalvo · 2023-04-24T09:52:37Z

src/argilla/server/alembic/versions/e402e9d9245e_create_responses_table.py

@@ -36,6 +36,7 @@ def upgrade() -> None:
        sa.Column("values", sa.JSON, nullable=False),
        sa.Column("record_id", sa.Uuid, sa.ForeignKey("records.id", ondelete="CASCADE"), nullable=False, index=True),
        sa.Column("user_id", sa.Uuid, sa.ForeignKey("users.id", ondelete="SET NULL"), index=True),
+        sa.Column("status", sa.String, nullable=False, index=True),


Can you explain your rationale for modifying this database migration instead of opting to create a new one?

So in the current latest version 1.6.0, only three tables have been created and managed by alembic workspaces, users, and workspaces_users.

In the branch feat/2615-instructions-task, migrations files have been created for some other tables. I've opted to modify the migration file instead of generating a new one because I've assumed that as these changes have not been used yet in any production environment, it is better to modify the migration file directly to add this new column, instead of creating a new migration file for just adding one column. One more file, but a little bit more noise.

Of course, if the migrations files in feat/2615-instruction-task were already applied in any production DB, then the logical thing to do would have been to generate a new migration script to add this new column, so alembic would have detected that there is a revision that has not been applied and that it needs to apply to the DB.

Having that said, for me it was not super clear how are you planning to handle the migrations with alembic. Maybe it's a good idea to have one single alembic revision/script to update the DB per the released version of the package.

Also, I've realized that alembic revision --autogenerate command cannot be used to generate migrations automatically because

argilla/src/argilla/server/alembic/env.py

Line 37 in 5e3ecd7

target_metadata = None

the target_metadata has not been set to

argilla/src/argilla/server/database.py

Line 51 in 5e3ecd7

class Base(DeclarativeBase):

Base.metadata.

Another doubt that I have, is that I've seen that the attributes nullable, index, etc are being set in the migrations files, but not in the mapped_column function of sqlalchemy.

Also, I've realized that alembic revision --autogenerate command cannot be used to generate migrations automatically

In your opinion, what are the benefits of utilizing automatic migrations?

Another doubt that I have, is that I've seen that the attributes nullable, index, etc are being set in the migrations files, but not in the mapped_column function of sqlalchemy.

In the same vein as the previous question, what are your views on the advantages of including these attributes in models?

I think that I can answer both questions at the same time 😁

So if we include these attributes in the models, then alembic will be available to automatically pick up them and it will generate a statement in the migration script with all the columns constraints, index, etc.

I think that it's really beneficial to use alembic automatic migrations, because alembic is able to automatically detect the updates you have done in your model class, or in other words, it is able to generate a diff between the schema generated with all the previous revisions and the one you're about to generate, and it will only create statements for the things that need to be updated.

So in my opinion, it is better to let alembic generate the migration file automatically with the SQLAlchemy model metadata than to let a human generate it manually and increase the chances of forgetting to include something in the migration script.

src/argilla/server/models.py

tests/server/api/v1/test_responses.py

jfcalvo · 2023-04-24T10:23:31Z

src/argilla/server/schemas/v1/responses.py


 class Response(BaseModel):
    id: UUID
    values: Dict[str, Any]
    record_id: UUID
    user_id: UUID
+    status: ResponseStatus


In src/argilla/server/schemas/v1/datasets.py, there exists a distinct Response schema. Would you suggest including the status field in this schema as well?

Oh, I didn't realize there was this Response schema. Yes, we should include the status field here too, otherwise when calling the listing records /api/v1/datasets/{dataset_id}/records the status for each response won't be there, and the frontend would have to send an additional request to get the status for getting the responses status, which is not ideal.

I think it makes sense to have a ResponseBase class in src/argilla/server/schemas/v1/responses.py, this way we avoid code duplication.

We can create a Responseclass inheriting from ResponseBase class with additional attributes that have to be exposed when listing responses (like the record_id) in src/argilla/server/schemas/v1/responses.py.

We can create a Response class inheriting from ResponseBase with no additional attributes that can be used when listing records in src/argilla/server/schemas/v1/datasets.py.

gabrielmbmb · 2023-04-25T11:30:34Z

hey @jfcalvo, I've changed the status of the PR to ready for review. Is there anything else missing that I should add?

jfcalvo · 2023-04-25T12:44:05Z

hey @jfcalvo, I've changed the status of the PR to ready for review. Is there anything else missing that I should add?

Thank you, @gabrielmbmb, for your answers. The PR looks fine, and @frascuchon should be reaching out to you soon for the next steps.

gabrielmbmb added 3 commits April 23, 2023 10:15

Add Response.status column

f5c433c

Add PUT /api/v1/responses/{response_id}/status endpoint

18951a8

Add unit tests for PUT /api/v1/response/{response_id}/status endpoint

d56b875

jfcalvo self-requested a review April 24, 2023 07:34

Merge branch 'feat/2615-instructions-task' into feat/response-status-…

a9bb122

…endpoint

jfcalvo requested changes Apr 24, 2023

View reviewed changes

gabrielmbmb added 3 commits April 24, 2023 13:38

Update to display ResponseStatus enum str value

2977dca

Add setting Response invalid status unit test

6ba0060

Update unit tests to check response body

2366e83

gabrielmbmb marked this pull request as ready for review April 25, 2023 08:34

alvarobartt linked an issue Apr 25, 2023 that may be closed by this pull request

Allow discarding/submitting responses for a record #2733

Closed

frascuchon deleted the branch argilla-io:feat/2615-instructions-task May 24, 2023 22:36

frascuchon closed this May 24, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add v1 enpoint to set `Response` status #2756

feat: add v1 enpoint to set `Response` status #2756

gabrielmbmb commented Apr 23, 2023

gabrielmbmb commented Apr 23, 2023 •

edited

Loading

jfcalvo commented Apr 24, 2023

jfcalvo left a comment

jfcalvo Apr 24, 2023

gabrielmbmb Apr 24, 2023 •

edited

Loading

jfcalvo Apr 24, 2023

gabrielmbmb Apr 24, 2023

jfcalvo Apr 24, 2023

gabrielmbmb Apr 24, 2023 •

edited

Loading

gabrielmbmb commented Apr 25, 2023

jfcalvo commented Apr 25, 2023

feat: add v1 enpoint to set Response status #2756

feat: add v1 enpoint to set Response status #2756

Conversation

gabrielmbmb commented Apr 23, 2023

Description

gabrielmbmb commented Apr 23, 2023 • edited Loading

jfcalvo commented Apr 24, 2023

jfcalvo left a comment

Choose a reason for hiding this comment

jfcalvo Apr 24, 2023

Choose a reason for hiding this comment

gabrielmbmb Apr 24, 2023 • edited Loading

Choose a reason for hiding this comment

jfcalvo Apr 24, 2023

Choose a reason for hiding this comment

gabrielmbmb Apr 24, 2023

Choose a reason for hiding this comment

jfcalvo Apr 24, 2023

Choose a reason for hiding this comment

gabrielmbmb Apr 24, 2023 • edited Loading

Choose a reason for hiding this comment

gabrielmbmb commented Apr 25, 2023

jfcalvo commented Apr 25, 2023

feat: add v1 enpoint to set `Response` status #2756

feat: add v1 enpoint to set `Response` status #2756

gabrielmbmb commented Apr 23, 2023 •

edited

Loading

gabrielmbmb Apr 24, 2023 •

edited

Loading

gabrielmbmb Apr 24, 2023 •

edited

Loading