File access security guide #9156
🪼 branch checks and previews
Install Gradio from this PR:
pip install https://gradio-pypi-previews.s3.amazonaws.com/cab9ba13ebec3bbe56ce358ab2ec7802b0b28240/gradio-4.42.0-py3-none-any.whl
Install Gradio Python Client from this PR:
pip install "gradio-client @ git+https://github.com/gradio-app/gradio@cab9ba13ebec3bbe56ce358ab2ec7802b0b28240#subdirectory=client/python"
Install Gradio JS Client from this PR:
npm install https://gradio-npm-previews.s3.amazonaws.com/cab9ba13ebec3bbe56ce358ab2ec7802b0b28240/gradio-client-1.5.1.tgz
🦄 change detected
This Pull Request includes changes to the following packages.
With the following changelog entry.
Maintainers or the PR author can modify the PR title to modify this entry.
Sharing your Gradio app with others (by hosting it on Spaces, on your own server, or through temporary share links) **exposes** certain files on your machine to the internet.

This guide will explain which ones as well as some best practices for making sure the files on your machine are secure.
Suggested change:
- This guide will explain which ones as well as some best practices for making sure the files on your machine are secure.
+ This guide explains which files are exposed as well as best practices for making sure the files on your machine are secure.

First, it's important to understand that Gradio places files in a special `cache` before returning them to the frontend. For example, if your prediction function returns a video file, then Gradio will move that video to the `cache` after your prediction function runs and returns a URL the frontend can use to show the video. Any file in the `cache` is available via URL while the application is running.

Tip: You can customize the location of the `cache` by setting the `GRADIO_TEMP_DIR` environment variable to an absolute path, such as `/home/usr/scripts/project/temp/`.
Suggested change:
- Tip: You can customize the location of the `cache` by setting the `GRADIO_TEMP_DIR` environment variable to an absolute path, such as `/home/usr/scripts/project/temp/`.
+ Tip: You can customize the location of the cache by setting the `GRADIO_TEMP_DIR` environment variable to an absolute path, such as `/home/usr/scripts/project/temp/`.
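
For illustration, the tip could be applied from Python roughly as follows. This is a minimal sketch, not taken from the guide: the throwaway directory is only there so the snippet runs anywhere, and setting the variable before Gradio is imported is a precaution rather than a documented requirement.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical cache location: a fresh directory created just for this sketch.
# In a real project this would be an absolute path of your choosing.
cache_dir = Path(tempfile.mkdtemp(prefix="gradio_cache_")).resolve()

# Assumption: the variable is set before Gradio is imported and launched,
# so that Gradio sees it when it decides where to place cached files.
os.environ["GRADIO_TEMP_DIR"] = str(cache_dir)
```

Setting the variable in the shell before starting the app is equivalent as far as the process environment is concerned.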
3. It is in the current working directory of the python interpreter.
4. It is in the temp directory obtained by `tempfile.gettempdir()`.

Additionally, files in the current working directory whose name starts with a period (`.`) will not be moved to the cache. If no criteria are met, the prediction function that created that file will error. Gradio performs this check so that arbitrary files on your machine are not moved to the cache.
Suggested change:
- Additionally, files in the current working directory whose name starts with a period (`.`) will not be moved to the cache. If no criteria are met, the prediction function that created that file will error. Gradio performs this check so that arbitrary files on your machine are not moved to the cache.
+ Note: files in the current working directory whose name starts with a period (`.`) will not be moved to the cache, since they often contain sensitive information for your application.
+ If none of these criteria are met, the prediction function that created that file will raise an exception instead of moving the file to cache. Gradio performs this check so that arbitrary files on your machine cannot be accessed.
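
The rule discussed in this thread can be sketched in plain Python. This is an illustration only, not Gradio's implementation: it covers just the two criteria visible in this hunk (working directory and `tempfile.gettempdir()`) plus the dotfile rule, and the names `may_move_to_cache` and `_inside` are hypothetical.

```python
import tempfile
from pathlib import Path


def _inside(path: Path, root: Path) -> bool:
    """True if `path` lies somewhere inside the directory `root`."""
    return root in path.parents


def may_move_to_cache(path, cwd=None) -> bool:
    """Sketch of criteria 3 and 4 plus the dotfile rule; not Gradio's code."""
    target = Path(path).resolve()
    work_dir = Path(cwd).resolve() if cwd is not None else Path.cwd().resolve()
    temp_dir = Path(tempfile.gettempdir()).resolve()

    if _inside(target, work_dir):
        # Assumption: the dotfile rule looks only at the file's own name.
        return not target.name.startswith(".")
    return _inside(target, temp_dir)
```

Under this sketch, `.env` in the working directory is rejected while `output.mp4` next to it is accepted.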
* Set a `max_file_size` for your application.
* Do not treat arbitrary user input as input to a file-based component (`gr.Image`, `gr.File`, etc.).
* Prefer to use absolute paths in `allowed_paths`. If a path in `allowed_paths` is a directory, any file within that directory can be accessed. If passing a directory is necessary, make sure it only contains files related to your application.
* Run your gradio application from the same directory the application file is located in. This will narrow the scope of files Gradio will be allowed to move into the cache.
Also might be worth adding a quick example:
python app.py
instead of
python Users/.../dev/app.py

* Set a `max_file_size` for your application.
* Do not treat arbitrary user input as input to a file-based component (`gr.Image`, `gr.File`, etc.).
* Prefer to use absolute paths in `allowed_paths`. If a path in `allowed_paths` is a directory, any file within that directory can be accessed. If passing a directory is necessary, make sure it only contains files related to your application.
It sounds like you're connecting "Prefer to use absolute paths in `allowed_paths`." and "If a path in `allowed_paths` is a directory, any file within that directory can be accessed. If passing a directory is necessary, make sure it only contains files related to your application." but these are two independent points iiuc. I would separate them into two separate bullet points.
The point I was trying to make is that if you add `some_big_dir/` to `allowed_paths`, everything in it is exposed. I'm suggesting that the directories in `allowed_paths` are "as small as possible".
@@ -107,6 +107,15 @@ Environment variables in Gradio provide a way to customize your applications and | |||
``` | |||

### 12. `GRADIO_EXAMPLES_CACHE`
ah nice
Went a bit overboard with the suggestions sorry @freddyaboulton 😅, take whatever sounds reasonable. Also there's one or two places in the Guides where we link to
Thanks @abidlabs - will address in a bit!
Should be good for another review @abidlabs !

Note: files in the current working directory whose name starts with a period (`.`) will not be moved to the cache, since they often contain sensitive information.

If none of these criteria are met, the prediction function that created that file will raise an exception instead of moving the file to cache. Gradio performs this check so that arbitrary files on your machine cannot be accessed.
Suggested change:
- If none of these criteria are met, the prediction function that created that file will raise an exception instead of moving the file to cache. Gradio performs this check so that arbitrary files on your machine cannot be accessed.
+ If none of these criteria are met, the prediction function that is returning that file will raise an exception instead of moving the file to cache. Gradio performs this check so that arbitrary files on your machine cannot be accessed.

While running, Gradio apps will NOT ALLOW users to access:

- **Files that you explicitly block via the `blocked_paths` parameter in `launch()`**. You can pass in a list of additional directories or exact filepaths to the `blocked_paths` parameter in `launch()`. This parameter takes precedence over the files that Gradio exposes by default or by the `allowed_paths`.
I believe this is true:
Suggested change:
- - **Files that you explicitly block via the `blocked_paths` parameter in `launch()`**. You can pass in a list of additional directories or exact filepaths to the `blocked_paths` parameter in `launch()`. This parameter takes precedence over the files that Gradio exposes by default or by the `allowed_paths`.
+ - **Files that you explicitly block via the `blocked_paths` parameter in `launch()`**. You can pass in a list of additional directories or exact filepaths to the `blocked_paths` parameter in `launch()`. This parameter takes precedence over the files that Gradio exposes by default or by the `allowed_paths` parameter or by `gr.set_static_paths`.
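
The precedence described in this bullet can be pictured with a small standalone sketch. It is not Gradio's implementation and ignores the default-exposed files and `gr.set_static_paths`; `is_served` and `_is_within` are hypothetical names, and the paths in the usage note are made up.

```python
from pathlib import Path


def _is_within(path: Path, root: Path) -> bool:
    """True if `path` equals `root` or lies inside it."""
    return path == root or root in path.parents


def is_served(path, allowed_paths, blocked_paths) -> bool:
    """Blocked entries win over allowed ones, as the bullet above describes."""
    target = Path(path).resolve()
    # Check the block list first so it always takes precedence.
    if any(_is_within(target, Path(b).resolve()) for b in blocked_paths):
        return False
    return any(_is_within(target, Path(a).resolve()) for a in allowed_paths)
```

With `allowed_paths=["/srv/app/assets"]` and `blocked_paths=["/srv/app/assets/private"]`, a file under `assets/private/` is refused even though its parent directory is allowed. In an app, the same two lists would be passed to `launch()` through its `allowed_paths` and `blocked_paths` parameters.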
## Best Practices

* Set a `max_file_size` for your application.
* Do not treat arbitrary user input as input to a file-based component (`gr.Image`, `gr.File`, etc.). For example, the following interface would allow anyone to move an arbitrary file in your local directory to the cache: `gr.Interface(lambda s: s, "text", "file")`. This is because the user input is treated as an arbitrary file path.
Suggested change:
- * Do not treat arbitrary user input as input to a file-based component (`gr.Image`, `gr.File`, etc.). For example, the following interface would allow anyone to move an arbitrary file in your local directory to the cache: `gr.Interface(lambda s: s, "text", "file")`. This is because the user input is treated as an arbitrary file path.
+ * Do not return arbitrary user input from a function that is connected to a file-based output component (`gr.Image`, `gr.File`, etc.). For example, the following interface would allow anyone to move an arbitrary file in your local directory to the cache: `gr.Interface(lambda s: s, "text", "file")`. This is because the user input is treated as an arbitrary file path.
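
One way to avoid the pattern in that example is to map user text onto a fixed set of files instead of returning it as a path. The following is a hedged sketch: `DOWNLOADS`, `pick_download`, and the file names are hypothetical and not part of the guide.

```python
from pathlib import Path

# Hypothetical allowlist: user-facing names mapped to files the app means to serve.
DOWNLOADS = {
    "report": Path("outputs/report.pdf"),
    "summary": Path("outputs/summary.txt"),
}


def pick_download(name: str) -> str:
    """Return a known file path for `name`; never echo user text back as a path."""
    key = name.strip().lower()
    if key not in DOWNLOADS:
        # Anything outside the allowlist, including real paths, is rejected.
        raise ValueError(f"Unknown download: {name!r}")
    return str(DOWNLOADS[key])
```

A function like this could take the place of `lambda s: s` in `gr.Interface(..., "text", "file")`, so that a request such as `/etc/passwd` raises an error rather than reaching the file component.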

* Set a `max_file_size` for your application.
* Do not treat arbitrary user input as input to a file-based component (`gr.Image`, `gr.File`, etc.). For example, the following interface would allow anyone to move an arbitrary file in your local directory to the cache: `gr.Interface(lambda s: s, "text", "file")`. This is because the user input is treated as an arbitrary file path.
* Make `allowed_paths` as small as possible. If a path in `allowed_paths` is a directory, any file within that directory can be accessed. Ma sure the entires of `allowed_paths` only contains files related to your application.
Suggested change:
- * Make `allowed_paths` as small as possible. If a path in `allowed_paths` is a directory, any file within that directory can be accessed. Ma sure the entires of `allowed_paths` only contains files related to your application.
+ * Make `allowed_paths` as small as possible. If a path in `allowed_paths` is a directory, any file within that directory can be accessed. Make sure the entries of `allowed_paths` only contain files related to your application.
* Set a `max_file_size` for your application.
* Do not treat arbitrary user input as input to a file-based component (`gr.Image`, `gr.File`, etc.). For example, the following interface would allow anyone to move an arbitrary file in your local directory to the cache: `gr.Interface(lambda s: s, "text", "file")`. This is because the user input is treated as an arbitrary file path.
* Make `allowed_paths` as small as possible. If a path in `allowed_paths` is a directory, any file within that directory can be accessed. Ma sure the entires of `allowed_paths` only contains files related to your application.
* Run your gradio application from the same directory the application file is located in. This will narrow the scope of files Gradio will be allowed to move into the cache. For examples, prefer `python app.py` to `python Users/sources/project/app.py`.
Suggested change:
- * Run your gradio application from the same directory the application file is located in. This will narrow the scope of files Gradio will be allowed to move into the cache. For examples, prefer `python app.py` to `python Users/sources/project/app.py`.
+ * Run your gradio application from the same directory the application file is located in. This will narrow the scope of files Gradio will be allowed to move into the cache. For example, prefer `python app.py` to `python Users/sources/project/app.py`.
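
A script can also check for itself whether it was started from its own directory. This is a sketch of one possible guard, not something the guide prescribes; `launched_from_app_dir` is a hypothetical helper.

```python
from pathlib import Path


def launched_from_app_dir(app_file, cwd=None) -> bool:
    """True if the working directory is the folder that contains `app_file`."""
    app_dir = Path(app_file).resolve().parent
    current = Path(cwd).resolve() if cwd is not None else Path.cwd().resolve()
    return current == app_dir
```

Inside `app.py`, calling `launched_from_app_dir(__file__)` before launching would return `False` for an invocation like `python Users/sources/project/app.py`, which the app could report as a warning.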
LGTM @freddyaboulton! Small comments
Thank you @abidlabs !
* first draft Add code Add code Add code emphasis * suggestions * redirects * add changeset * trigger ci * typos --------- Co-authored-by: gradio-pr-bot <[email protected]>
* Fix unified case * commit * Add code * add changeset * notebook * Lint * delete * Fix code * fix tests * File access security guide (#9156) * first draft Add code Add code Add code emphasis * suggestions * redirects * add changeset * trigger ci * typos --------- Co-authored-by: gradio-pr-bot <[email protected]> * redirect * typos * link * fix * See what the problem is * less time * fix * try again with busted cache * try again * Code * Demo and code --------- Co-authored-by: gradio-pr-bot <[email protected]> Co-authored-by: pngwn <[email protected]>
Description
Pull out the "File Access" section from the "Sharing your App" guide into its own guide.
Take two of #9154
Closes: #(issue)
🎯 PRs Should Target Issues
Before you create a PR, please check to see if there is an existing issue for this change. If not, please create an issue before you create this PR, unless the fix is very small.
Not adhering to this guideline will result in the PR being closed.
Tests
PRs will only be merged if tests pass on CI. To run the tests locally, please set up your Gradio environment locally and run the tests:
bash scripts/run_all_tests.sh
You may need to run the linters:
bash scripts/format_backend.sh
and
bash scripts/format_frontend.sh