
Add option to overwrite UploadFile impl #697

Closed
canni opened this issue Oct 29, 2019 · 7 comments
Labels: feature (New feature or request)

Comments

canni commented Oct 29, 2019

In our case we're streaming in-flight uploaded files directly to MongoDB GridFS, so if we lose or crash an ASGI worker process that is still using an uploaded but not yet processed file, a scheduled job can detect this and resume the operation. With SpooledTemporaryFile this is impossible, because we can't guarantee that the resume job will land in the same container.

This all works well with Starlette; the only drawback is that we need to monkey-patch the UploadFile class with our own implementation, which sends chunks to Mongo's GridFS.

Could Starlette make the UploadFile class configurable?
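
For reference, a rough sketch of the kind of monkey-patch described above, assuming Motor's async GridFS API. The bucket name, connection string, and patch target are illustrative, and the UploadFile constructor signature has changed across Starlette versions:

```python
# Hypothetical sketch, not official Starlette API: stream multipart chunks to
# GridFS instead of a SpooledTemporaryFile. Connection details are placeholders.
from motor.motor_asyncio import AsyncIOMotorClient, AsyncIOMotorGridFSBucket
from starlette.datastructures import UploadFile

client = AsyncIOMotorClient("mongodb://localhost:27017")
bucket = AsyncIOMotorGridFSBucket(client.uploads)

class GridFSUploadFile(UploadFile):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # open_upload_stream returns a GridIn that accepts incremental writes
        self._grid_in = bucket.open_upload_stream(self.filename or "upload")

    async def write(self, data: bytes) -> None:
        # Called by the multipart parser for each chunk as it arrives
        await self._grid_in.write(data)

    async def close(self) -> None:
        await self._grid_in.close()

# The monkey-patch itself: make the multipart parser construct our class.
# (The exact attribute to patch depends on the Starlette version.)
import starlette.formparsers
starlette.formparsers.UploadFile = GridFSUploadFile
```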

gvbgduh (Member) commented Oct 30, 2019

@canni there's a similar question in #388 if it can be of any help.

tomchristie (Member) commented:

And yeah, configurable request parsers are absolutely something we want to support. Customising the handling of upload files will be a subset of that.

tomchristie added the feature label Nov 6, 2019
simonw (Contributor) commented Nov 25, 2019

I have a similar use case for this. I want my users to be able to upload a CSV file (potentially 100+ MB) and have my app then process that CSV file without blocking the event loop.

I might do that in a separate process via a message queue, but that means I need a clean way to say "don't delete this file at the end of the request; instead, rename it to somewhere known so that the other process can see it".

I can do that by copying the uploaded data to a new file, but at 100 MB it feels like it would be more efficient to take the temporary file that was already written to disk during the upload and atomically mv it somewhere else.
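
For what it's worth, a workaround along those lines is possible today if you accept one streamed copy inside the endpoint: write the upload to a temporary file on the destination filesystem, then atomically rename it. A minimal sketch, with a hypothetical persist_upload helper and an arbitrary chunk size (os.replace is atomic on POSIX when both paths are on the same filesystem):

```python
# Sketch of the "rename instead of copy" idea: stream the upload once into a
# temp file next to the destination, then atomically move it into place.
import os
import tempfile

async def persist_upload(upload_file, dest_path: str) -> None:
    # Create the temp file on the same filesystem as dest_path so that the
    # final os.replace() is an atomic rename rather than a copy.
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(dest_path))
    try:
        with os.fdopen(fd, "wb") as out:
            # 64 KB chunks keep memory flat even for 100+ MB uploads
            while chunk := await upload_file.read(64 * 1024):
                out.write(chunk)
        os.replace(tmp_path, dest_path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```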

tomchristie (Member) commented:

> I want my users to be able to upload a CSV file (potentially 100+ MB) and have my app then process that CSV file without blocking the event loop.

The upload parsing itself won't block anything. You probably just want to handle processing of the uploaded file in a background task: https://www.starlette.io/background/ (i.e. the response has been sent over the ASGI interface, but the task itself can still continue to run and perform some more work before it terminates).
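
A minimal sketch of that suggestion; the route, the "csv" field name, and the destination path are illustrative:

```python
# Sketch: accept the upload, persist it, then process it after the response
# has been sent, using Starlette's BackgroundTask.
from starlette.applications import Starlette
from starlette.background import BackgroundTask
from starlette.responses import JSONResponse
from starlette.routing import Route

async def process_csv(path: str) -> None:
    ...  # heavy CSV processing runs here, after the response has gone out

async def upload(request):
    form = await request.form()
    upload_file = form["csv"]  # assumed form field name
    dest = f"/tmp/{upload_file.filename}"
    with open(dest, "wb") as out:
        # Copy in chunks so a 100 MB file is never held in memory at once
        while chunk := await upload_file.read(64 * 1024):
            out.write(chunk)
    task = BackgroundTask(process_csv, dest)
    return JSONResponse({"accepted": True}, background=task)

app = Starlette(routes=[Route("/upload", upload, methods=["POST"])])
```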

nathanlcarlson commented:

The Elasticsearch client provides a handy way of specifying your own (de)serializer, in this case for JSON, but a similar model could be adopted here.
https://elasticsearch-py.readthedocs.io/en/stable/#custom-serializers
Here is a link to some relevant source code:
https://github.com/elastic/elasticsearch-py/blob/master/elasticsearch/serializer.py
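
For context, the pattern from the linked docs looks roughly like this; an analogous constructor argument in Starlette could accept a custom UploadFile factory:

```python
# The elasticsearch-py pattern: pass a custom (de)serializer at construction
# time. SetEncoder mirrors the example in the linked documentation.
import json
from elasticsearch import Elasticsearch
from elasticsearch.serializer import JSONSerializer

class SetEncoder(JSONSerializer):
    def default(self, obj):
        # Teach the serializer about a type the default one rejects
        if isinstance(obj, set):
            return list(obj)
        return JSONSerializer.default(self, obj)

es = Elasticsearch(serializer=SetEncoder())
```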

tomchristie (Member) commented:

Going to close this off for now, with the same justification as #849:

> Out of scope. If you need more complex parsing than we currently provide, you'll need to do it within your own project.

zevisert commented Nov 4, 2022

Bump here from discussion #1934: I can provide my own complex parsing, but there's overhead I can't get rid of, because I have to read the spooled file from start to finish a second time. I'd hoped to be able to have Starlette use a custom UploadFile subclass, as that would be a great place for us to add the logic we wanted.
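
As a concrete illustration of that ask, a sketch of a subclass that does its processing per chunk during parsing, so the spooled file never needs a second pass. The hashing stands in for real logic, and wiring the subclass in still requires the monkey-patching caveats described earlier:

```python
# Sketch: do incremental work in write() as the multipart parser feeds chunks,
# instead of re-reading the spooled file once parsing has finished.
import hashlib
from starlette.datastructures import UploadFile

class HashingUploadFile(UploadFile):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.sha256 = hashlib.sha256()

    async def write(self, data: bytes) -> None:
        self.sha256.update(data)   # incremental processing per chunk
        await super().write(data)  # still spool to disk as usual
```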
