Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace routerify in file transfer server with axum #2461
Replace routerify in file transfer server with axum #2461
Changes from 10 commits
4051213
46f3573
e5861c0
12cbcf3
a2eaf4d
ca19392
62dd8f5
63f7f9a
8a920e3
09da1df
22995e5
93aa002
d7a9445
1ca42b2
fd5a907
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This refactoring is incidental, I noticed the test had both a hard sleep (so it was comparatively slow) and not terribly deterministic (no guarantee server one would attempt to bind first, no guarantee the port binding would fail within 100ms). The timeout-based approach gives a lot more leniency to the servers to fail, and polling
server_one
first ensures that it binds to a port first (assuming that is the first thing it attempts to do, which I think is a reasonable assumption).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In response to #2461 (comment), I've now changed this test again, and as a result the remaining complexity around which port binds first (i.e. what I'm relying on the
biased
directive for) has disappeared entirely.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: This will log an error not only when a request is handled, but every time
.into_response()
is called.Axum has a
tracing
feature that makes it output traces when handling events. Could it be used for logging errors like we try to do here?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure axum's
tracing
is quite what we want here. Here, I want to determine whether I'm hiding information from the user (which is when we have an internal server error), and if so, log that information. This avoids polluting the server logs with errors arising from clients misusing the API, but means we don't expose implementation details to the user of the API. If we rely on some sort of middleware for logging, either it would take an input ofFileTransferRequestError
, in which case we might as well do everything here as we'd still need to log selectively, or we'd take the converted response, by which point we know the status code, so it's easy to log selectively, but we've thrown away the pertinent information (e.g. an IO error).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do these really want to be
403 Forbidden
(it's what the code did previously)? It feels like an encapsulation failure to recognise for certain paths that we can't delete/upload to them.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO "could not delete path due to IO error" could happen due e.g. path not existing, in which case we need to return 404, or due to server not having permissions to do filesystem delete, in which case we should return 500, so if I'm not wrong, then even using a single return code for all IO errors is kinda bad. As for
403 Forbidden
, I also feel that it should be used when the client that made the request doesn't have permissions to do something, which doesn't seem to be the case here.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that was roughly what I was thinking. We currently handle the not found case by returning
202 Accepted
, like the successful deletion case (which I assume is to makeDELETE
requests idempotent). Returning404 Not Found
may also make sense, or if the file previously existed,410 Gone
, although a success response tells the user the file definitely no longer exists, and the request definitely succeeded.I think if we've hit the case handled by this code, it's some error accessing the file system, which in my mind should definitely be a
500 Internal Server Error
. This code may also be caused by trying to delete a directory, which I talk about in a different comment.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
403 doesn't look bad, as the spec mentions the following:
The access is tied to the application logic, such as insufficient rights to a resource.
which is the case here, right? I vaguely remember a similar discussion happening when the original author introduced this case in the first place and we settled with 403 for the same reason.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what application logic is relevant here. With the current API at least, we have no control over permissions. If there's a permissions error on the file system, that implies some other process is interfering with /var/tedge/file-transfer, which is not an application-level error, it's an error from the environment. The question here mostly boils down to whether we want to assume it's some file that should have never been there (in which case encapsulating the error entirely and responding with 404 Not Found is appropriate, or whether we assume the file was uploaded correctly, but something as changed the internal permissions so we can't access it, in which case a 500 Internal Server Error is appropriate.
Thinking more broadly, a useful litmus test to apply here would be "if one user receives a 403 for a resource, there must exist some other user that is able to access the resource in question". If not, and the 403 response is global, then I don't believe this can be considered an authorization issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@didier-wenzek @reubenmiller do you have opinions on this point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deleting a resource that doesn't exist is definitely not an error. Returning 202 in such a case makes the request idempotent.
Trying to delete a non-empty directory, should be considered as a user error as we don't want to support recursive deletes. So 400 makes sense. I would do the same for empty-directories.
If the error is due to missing access, I would say this is a 500 as the file-transfer service should own all the files under file transfer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From RFC 9110:
If we get an IO error due to missing file permissions, I would say that the server tried to execute the request, but failed, due to a misconfiguration (e.g. not having write access to
file-transfer
directory), in which case 500 would be fitting.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if it is a directory, 404 makes more sense as it the error is in the URL and the issue is that the resource in question cannot be interacted with (and essentially doesn't exist), not that the request is inherently invalid. I'll make sure the error message clarifies that the file we attempted to delete is a directory.
Other than that, it seems like there is now broad consensus that 500 is the appropriate response for an IO error, so I'll sort that out too.