Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File access security guide #9156

Merged
merged 8 commits into from
Aug 22, 2024
Merged
Show file tree
Hide file tree
Changes from 6 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions .changeset/tired-moons-tell.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
---
"website": minor
---

feat:File access security guide
2 changes: 1 addition & 1 deletion guides/03_building-with-blocks/06_custom-CSS-and-JS.md
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,7 @@ with gr.Blocks(css=".gradio-container {background: url('file=clouds.jpg')}") as
...
```

Note: By default, files in the host machine are not accessible to users running the Gradio app. As a result, you should make sure that any referenced files (such as `clouds.jpg` here) are either URLs or allowed via the `allow_list` parameter in `launch()`. Read more in our [section on Security and File Access](/guides/sharing-your-app#security-and-file-access).
Note: By default, files in the host machine are not accessible to users running the Gradio app. As a result, you should make sure that any referenced files (such as `clouds.jpg` here) are either URLs or allowed via the `allow_list` parameter in `launch()`. Read more in our [section on Security and File Access](/guides/file-access).


## The `elem_id` and `elem_classes` Arguments
Expand Down
39 changes: 1 addition & 38 deletions guides/04_additional-features/07_sharing-your-app.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,8 +9,7 @@ In this Guide, we dive more deeply into the various aspects of sharing a Gradio
5. [Accessing network requests](#accessing-the-network-request-directly)
6. [Mounting within FastAPI](#mounting-within-another-fast-api-app)
7. [Authentication](#authentication)
8. [Security and file access](#security-and-file-access)
9. [Analytics](#analytics)
8. [Analytics](#analytics)

## Sharing Demos

Expand Down Expand Up @@ -409,42 +408,6 @@ if __name__ == '__main__':
There are actually two separate Gradio apps in this example! One that simply displays a log in button (this demo is accessible to any user), while the other main demo is only accessible to users that are logged in. You can try this example out on [this Space](https://huggingface.co/spaces/gradio/oauth-example).



## Security and File Access
freddyaboulton marked this conversation as resolved.
Show resolved Hide resolved

Sharing your Gradio app with others (by hosting it on Spaces, on your own server, or through temporary share links) **exposes** certain files on the host machine to users of your Gradio app.

In particular, Gradio apps ALLOW users to access to four kinds of files:

- **Temporary files created by Gradio.** These are files that are created by Gradio as part of running your prediction function. For example, if your prediction function returns a video file, then Gradio will save that video to a temporary cache on your device and then send the path to the file to the front end. You can customize the location of temporary cache files created by Gradio by setting the environment variable `GRADIO_TEMP_DIR` to an absolute path, such as `/home/usr/scripts/project/temp/`. You can delete the files created by your app when it shuts down with the `delete_cache` parameter of `gradio.Blocks`, `gradio.Interface`, and `gradio.ChatInterface`. This parameter is a tuple of integers of the form `[frequency, age]` where `frequency` is how often to delete files and `age` is the time in seconds since the file was last modified.


- **Cached examples created by Gradio.** These are files that are created by Gradio as part of caching examples for faster runtimes, if you set `cache_examples=True` or `cache_examples="lazy"` in `gr.Interface()`, `gr.ChatInterface()` or in `gr.Examples()`. By default, these files are saved in the `gradio_cached_examples/` subdirectory within your app's working directory. You can customize the location of cached example files created by Gradio by setting the environment variable `GRADIO_EXAMPLES_CACHE` to an absolute path or a path relative to your working directory.

- **Files that you explicitly allow via the `allowed_paths` parameter in `launch()`**. This parameter allows you to pass in a list of additional directories or exact filepaths you'd like to allow users to have access to. (By default, this parameter is an empty list).

- **Static files that you explicitly set via the `gr.set_static_paths` function**. This parameter allows you to pass in a list of directories or filenames that will be considered static. This means that they will not be copied to the cache and will be served directly from your computer. This can help save disk space and reduce the time your app takes to launch but be mindful of possible security implications.

Gradio DOES NOT ALLOW access to:

- **Files that you explicitly block via the `blocked_paths` parameter in `launch()`**. You can pass in a list of additional directories or exact filepaths to the `blocked_paths` parameter in `launch()`. This parameter takes precedence over the files that Gradio exposes by default or by the `allowed_paths`.

- **Any other paths on the host machine**. Users should NOT be able to access other arbitrary paths on the host.

Sharing your Gradio application will also allow users to upload files to your computer or server. You can set a maximum file size for uploads to prevent abuse and to preserve disk space. You can do this with the `max_file_size` parameter of `.launch`. For example, the following two code snippets limit file uploads to 5 megabytes per file.

```python
import gradio as gr

demo = gr.Interface(lambda x: x, "image", "image")

demo.launch(max_file_size="5mb")
# or
demo.launch(max_file_size=5 * gr.FileSize.MB)
```

Please make sure you are running the latest version of `gradio` for these security settings to apply.

## Analytics

By default, Gradio collects certain analytics to help us better understand the usage of the `gradio` library. This includes the following information:
Expand Down
65 changes: 65 additions & 0 deletions guides/04_additional-features/08_file-access.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
# Security and File Access

Sharing your Gradio app with others (by hosting it on Spaces, on your own server, or through temporary share links) **exposes** certain files on your machine to the internet.

This guide explains which files are exposed as well as some best practices for making sure the files on your machine are secure.

## The Gradio cache

First, it's important to understand that Gradio copies files to a `cache` directory before returning them to the frontend. This prevents files from being overwritten by one user while they are still needed by another user of your application. For example, if your prediction function returns a video file, then Gradio will move that video to the `cache` after your prediction function runs and returns a URL the frontend can use to show the video. Any file in the `cache` is available via URL to all users of your running application.

Tip: You can customize the location of the cache by setting the `GRADIO_TEMP_DIR` environment variable to an absolute path, such as `/home/usr/scripts/project/temp/`.

## The files Gradio will move to the cache

Before placing a file in the cache, Gradio will check to see if the file meets at least one of following criteria:

1. It was uploaded by a user.
2. It is in the `allowed_paths` parameter of the `Blocks.launch` method.
3. It is in the current working directory of the python interpreter.
4. It is in the temp directory obtained by `tempfile.gettempdir()`.
freddyaboulton marked this conversation as resolved.
Show resolved Hide resolved
5. It was specified by the developer before runtime, e.g. cached examples or default values of components.

Note: files in the current working directory whose name starts with a period (`.`) will not be moved to the cache, since they often contain sensitive information.

If none of these criteria are met, the prediction function that created that file will raise an exception instead of moving the file to cache. Gradio performs this check so that arbitrary files on your machine cannot be accessed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If none of these criteria are met, the prediction function that created that file will raise an exception instead of moving the file to cache. Gradio performs this check so that arbitrary files on your machine cannot be accessed.
If none of these criteria are met, the prediction function that is returning that file will raise an exception instead of moving the file to cache. Gradio performs this check so that arbitrary files on your machine cannot be accessed.


Tip: If at any time Gradio blocks a file that you would like it to process, add its path to the `allowed_paths` parameter.

## The files Gradio will allow others to access

In short, these are the files located in the `cache` and any other additional paths **YOU** grant access to via `allowed_paths` or `gr.set_static_paths`.

- **The `allowed_paths` parameter in `launch()`**. This parameter allows you to pass in a list of additional directories or exact filepaths you'd like to allow users to have access to. (By default, this parameter is an empty list).

- **Static files that you explicitly set via the `gr.set_static_paths` function**. This parameter allows you to pass in a list of directories or filenames that will be considered static. This means that they will not be copied to the cache and will be served directly from your computer. This can help save disk space and reduce the time your app takes to launch but be mindful of possible security implications.

## The files Gradio will not allow others to access

While running, Gradio apps will NOT ALLOW users to access:

- **Files that you explicitly block via the `blocked_paths` parameter in `launch()`**. You can pass in a list of additional directories or exact filepaths to the `blocked_paths` parameter in `launch()`. This parameter takes precedence over the files that Gradio exposes by default or by the `allowed_paths`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is true:

Suggested change
- **Files that you explicitly block via the `blocked_paths` parameter in `launch()`**. You can pass in a list of additional directories or exact filepaths to the `blocked_paths` parameter in `launch()`. This parameter takes precedence over the files that Gradio exposes by default or by the `allowed_paths`.
- **Files that you explicitly block via the `blocked_paths` parameter in `launch()`**. You can pass in a list of additional directories or exact filepaths to the `blocked_paths` parameter in `launch()`. This parameter takes precedence over the files that Gradio exposes by default or by the `allowed_paths` parameter or by `gr.set_static_paths`.


- **Any other paths on the host machine**. Users should NOT be able to access other arbitrary paths on the host.


## Uploading Files

Sharing your Gradio application will also allow users to upload files to your computer or server. You can set a maximum file size for uploads to prevent abuse and to preserve disk space. You can do this with the `max_file_size` parameter of `.launch`. For example, the following two code snippets limit file uploads to 5 megabytes per file.

```python
import gradio as gr

demo = gr.Interface(lambda x: x, "image", "image")

demo.launch(max_file_size="5mb")
# or
demo.launch(max_file_size=5 * gr.FileSize.MB)
```

## Best Practices

* Set a `max_file_size` for your application.
* Do not treat arbitrary user input as input to a file-based component (`gr.Image`, `gr.File`, etc.). For example, the following interface would allow anyone to move an arbitrary file in your local directory to the cache: `gr.Interface(lambda s: s, "text", "file")`. This is because the user input is treated as an arbitrary file path.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Do not treat arbitrary user input as input to a file-based component (`gr.Image`, `gr.File`, etc.). For example, the following interface would allow anyone to move an arbitrary file in your local directory to the cache: `gr.Interface(lambda s: s, "text", "file")`. This is because the user input is treated as an arbitrary file path.
* Do not return arbitrary user input from a function that is connected to a file-based output component (`gr.Image`, `gr.File`, etc.). For example, the following interface would allow anyone to move an arbitrary file in your local directory to the cache: `gr.Interface(lambda s: s, "text", "file")`. This is because the user input is treated as an arbitrary file path.

* Make `allowed_paths` as small as possible. If a path in `allowed_paths` is a directory, any file within that directory can be accessed. Ma sure the entires of `allowed_paths` only contains files related to your application.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Make `allowed_paths` as small as possible. If a path in `allowed_paths` is a directory, any file within that directory can be accessed. Ma sure the entires of `allowed_paths` only contains files related to your application.
* Make `allowed_paths` as small as possible. If a path in `allowed_paths` is a directory, any file within that directory can be accessed. Make sure the entires of `allowed_paths` only contains files related to your application.

* Run your gradio application from the same directory the application file is located in. This will narrow the scope of files Gradio will be allowed to move into the cache. For examples, prefer `python app.py` to `python Users/sources/project/app.py`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Run your gradio application from the same directory the application file is located in. This will narrow the scope of files Gradio will be allowed to move into the cache. For examples, prefer `python app.py` to `python Users/sources/project/app.py`.
* Run your gradio application from the same directory the application file is located in. This will narrow the scope of files Gradio will be allowed to move into the cache. For example, prefer `python app.py` to `python Users/sources/project/app.py`.

Original file line number Diff line number Diff line change
Expand Up @@ -107,6 +107,15 @@ Environment variables in Gradio provide a way to customize your applications and
```


### 12. `GRADIO_EXAMPLES_CACHE`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah nice


- **Description**: If you set `cache_examples=True` or `cache_examples="lazy"` in `gr.Interface()`, `gr.ChatInterface()` or in `gr.Examples()`, Gradio will run your prediction function and save the results to disk. By default, this is in the `gradio_cached_examples/` subdirectory within your app's working directory. You can customize the location of cached example files created by Gradio by setting the environment variable `GRADIO_EXAMPLES_CACHE` to an absolute path or a path relative to your working directory.
- **Default**: `"gradio_cached_examples/"`
- **Example**:
```sh
export GRADIO_EXAMPLES_CACHE="custom_cached_examples/"
```

## How to Set Environment Variables

To set environment variables in your terminal, use the `export` command followed by the variable name and its value. For example:
Expand All @@ -124,3 +133,5 @@ GRADIO_SERVER_NAME="localhost"

Then, use a tool like `dotenv` to load these variables when running your application.



3 changes: 2 additions & 1 deletion js/_website/src/routes/redirects.js
Original file line number Diff line number Diff line change
Expand Up @@ -173,5 +173,6 @@ export const redirects = {
"/docs/no-reload": "/docs/gradio/NO_RELOAD",
"/docs/python-client/python-client": "/docs/python-client/introduction",
"/docs/python-client/js-client": "/docs/js-client",
"/docs/gradio/interface#interface-queue": "/docs/gradio/interface"
"/docs/gradio/interface#interface-queue": "/docs/gradio/interface",
"/guides/sharing-your-app#security-and-file-access": "/guides/file-access"
};
Loading