-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use yaml.safe_dump
and _load
?
#26
Comments
Thank you, @bollwyvl! This is super helpful! I'll work on a PR later today to address these issues. |
Python example of this can be found in TF: GHSA-r6jx-9g48-2r5r |
@bollwyvl Wanted some more clarification from you here. Would you agree in saying this issue is specific to a YAML engine implementation? It doesn't look like the YAML spec requires an implementation to support custom tags that allow for ACE. The required tags are for very common types (e.g. dict, str, float) and seem to be safe as long as the engine is using the language's standard library and is safe. It seems the issue stems from the fact most implementations choose to have ACE via tags available by default for whatever reason, rather than stemming from the YAML spec itself. |
Yes, indeed it's implementation-driven, but also use case driven. Throughout the jupyter ecosystem, YAML has had a lot of push-back as an interchange and configuration format because of these security gotchas around untrusted input, performance, and the complexity of adopting one of the implementations: for a few years, the primary one just didn't work very well on windows. The deployment and performance situation has improved, but the fact that, as a default, it will allow calling any-old-class is not That being said: for hand-writing JSON schema, YAML is lovely, especially for things like anchors, with decent tooling e.g. If the only purpose is for loading configurations in this library is to be then be handed over to instantiate a JSON Schema validator, none of the none-core JSON types are of particular value, and If events are going to be flowing as YAML (which is a bad idea for myriad reasons), then best-effort-at-security seems like a requirement, rather than a sensible default. At any rate: starting with pyyaml's best-effort-at-security |
Agreed. We should not follow the existing convention of being unsafe by default 🙈 |
@Zsailer Sorry, forgot to specify "fixes #XX" instead to auto-close. |
No problem. Thanks, @dlqqq. Closing! |
Using the default
load
anddump
features of pyyaml can have unintended consequences, as they can use YAML tags to execute arbitrary code.For an application, this is great as it provides a low-code way to use complex structures.
For a library, this is kinda bad, as it means any yaml source is immediately an arbitrary code execution engine.
Likely, this would just need
from yaml import safe_dump, safe_load
.Additionally, when reading yaml, using
read_text(encoding="utf-8")
is a good way to avoid future issues.The text was updated successfully, but these errors were encountered: