-
-
Notifications
You must be signed in to change notification settings - Fork 934
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: expose multipart headers to users #1311
Conversation
starlette/datastructures.py
Outdated
_KT = typing.TypeVar("_KT", bound=typing.Hashable) | ||
_VT = typing.TypeVar("_VT") | ||
_T = typing.TypeVar("_T") | ||
|
||
|
||
_UNSET: typing.Any = object() | ||
|
||
|
||
class ImmutableMultiDict(typing.Mapping[_KT, _VT]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making ImmutableMultiDict a generic is to be able to properly type hint
starlette/starlette/datastructures.py
Line 682 in 9f6e890
headers: typing.Optional[ImmutableMultiDict[str, Headers]] |
starlette/datastructures.py
Outdated
""" | ||
An immutable multidict, containing both file uploads and text input. | ||
""" | ||
headers: typing.Optional[ImmutableMultiDict[str, Headers]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is where users will access the headers.
I had to make a choice here: create a class MultipartFormData(FormData): ...
subclass and add headers
there or make headers
Optional (because they are not present in application/x-www-form-urlencoded
requests).
I chose the latter, but both should work and be backwards compatible. The former is just more code changes, and results in users having to do isinstance(await request.form(), MultiPartFormData)
vs. (await request.form()).headers is not None
before accessing headers.
tests/test_formparsers.py
Outdated
@@ -29,6 +29,8 @@ async def app(scope, receive, send): | |||
else: | |||
output[key] = value | |||
await request.close() | |||
if data.headers is not None: | |||
output["__headers"] = [(field_name, dict(headers.items())) for field_name, headers in data.headers.multi_items()] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also return a dict w/ 2 keys in every test instead of using a __headers
key. That does mean all of the tests that weren't affected by this change would also need to be edited tho.
starlette/datastructures.py
Outdated
@@ -665,3 +647,35 @@ def __getattr__(self, key: typing.Any) -> typing.Any: | |||
|
|||
def __delattr__(self, key: typing.Any) -> None: | |||
del self._state[key] | |||
|
|||
|
|||
class FormData(ImmutableMultiDict[str, typing.Union[str, UploadFile]]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved this under Headers
to avoid the forward ref. We can also just from __future__ import annotations
if necessary and void moving the code block.
It is also possible to implement this as: class FormData(ImmutableMultiDict[str, typing.Union[str, UploadFile]]):
_headers: typing.Optional[ImmutableMultiDict[str, Headers]]
def __init__(
self,
*args: typing.Union[
"FormData",
typing.Mapping[str, typing.Union[str, UploadFile]],
typing.List[typing.Tuple[str, typing.Union[str, UploadFile]]],
],
raw_headers: typing.Optional[
typing.List[typing.Tuple[str, typing.List[typing.Tuple[bytes, bytes]]]]
] = None,
) -> None:
super().__init__(*args)
if raw_headers is not None:
self._headers = ImmutableMultiDict(
[(field_name, Headers(raw=raw)) for field_name, raw in raw_headers]
)
else:
self._headers = ImmutableMultiDict(
[(field_name, Headers()) for field_name, _ in self.multi_items()]
)
def multi_items_with_headers(self) -> typing.Iterable[typing.Tuple[str, typing.Union[str, UploadFile], Headers]]:
return (
(field_name, item, headers)
for (field_name, item), (_, headers)
in zip(self.multi_items(), self._headers.multi_items()) # type: ignore # the return time in ImmutableMultiDict is hardcoded to tuple[str, str]
)
... I think there's pros and cons to either approach. |
@tomchristie to elaborate on #1311 (comment), the currently proposed API relies on dict ordering in Python. There is an alternative which is to add a method that still relies on dict ordering internally, but transfers that conceptual leap from users to implementation: Nowform = FormData(...)
for (name, form), (_, headers) in zip(form.multi_items(), form.headers.multi_items()):
... With a wrapper methodform = FormData(...)
for name, form, headers in zip(form.multi_items_with_headers()):
... |
That's an interesting comment, thanks. I think I've dropped the ball slightly here, because my first questions should have been:
|
If anything Tom, I dropped the ball by not leading with that. First, a bit of general HTTP context is appropriate. I'm sure you are familiar with this but others reading this PR may not be. So generally the proposed API here is "a way for users to get access to the HTTP headers included with each multipart field". I think that given the current Starlette APIs, the two concrete options we have are those in #1311 (comment) As for how Flask and Django do it, let me start with Flask. Here are examples for both: import io
import httpx
from flask import Flask, request
from werkzeug.datastructures import FileStorage, Headers
app = Flask(__name__)
@app.route('/', methods=['POST'])
def upload_file():
file1 = request.form["file1"]
assert isinstance(file1, str)
assert file1 == "{}"
file2 = request.files["file2"]
assert isinstance(file2, FileStorage)
assert isinstance(file2.headers, Headers)
assert file2.headers["Content-Type"] == "text/plain"
assert next(iter(file2)) == b"123"
return ("", 200)
with httpx.Client(app=app, base_url="http://testapp") as client:
resp = client.post(
"/",
files=[
("file1", (None, b"{}", "application/json")),
("file2", ("file", io.BytesIO(b"123"), "text/plain")),
],
)
assert resp.status_code == 200 import asyncio
import io
import httpx
from starlette.applications import Starlette
from starlette.datastructures import UploadFile
from starlette.requests import Request
from starlette.responses import Response
from starlette.routing import Route
async def upload_file(request: Request):
form = await request.form()
file1 = form["file1"]
assert isinstance(file1, str)
assert file1 == "{}"
file2 = form["file2"]
assert isinstance(file2, UploadFile)
assert await file2.read() == b"123"
return Response()
app = Starlette(routes=[Route("/", upload_file, methods=["POST"])])
async def test():
async with httpx.AsyncClient(app=app, base_url="http://testapp") as client:
resp = await client.post(
"/",
files=[
("file1", (None, b"{}", "application/json")),
("file2", ("file", io.BytesIO(b"123"), "text/plain")),
],
)
assert resp.status_code == 200
asyncio.run(test()) But there is a problem here with Flask's API: it is totally possible to include headers in a non-file field ( My goal is that there is no loss of information, or we would be excluding many valid use cases. Hence my API proposals retain this information in all cases, even when the upload is not a file / has not filename. We could also copy these headers to the UploadFile, which would make the more common use case simpler and give us a nice parity of API with Flask. We can even redirect the current I'll look a bit into Django, but I am less familiar. |
Right, that makes sense. So my thoughts here are that we almost certainly don't care about headers for non-files. I think it's one of those this-can-feasibly-exist-but-isn't-actually-any-use-in-practice-or-ever-actually-seen-in-the-wild kinda cases. |
I see your point, but I'll have to respectfully disagree. If Starlette didn't consume this data, I would agree with you because the (admittedly fewer) users who might want the headers would have an escape hatch. But since Starlette consumes this data and discards it, there is no escape hatch. I know I've wanted to do this in my work, I'll try to see if I can find any public examples. This said, I am happy to make a separate PR to add the full headers to |
Smart approach, yup. |
I'm going to close this PR since #1382 solves most of the use case for me. Thanks @tomchristie ! |
From @tomchristie on Gitter
I'll try to make a PR to httpx at some point for this as well