Hide sensitive data in UI #8421
Comments
Thanks for opening your first issue here! Be sure to follow the issue template! |
Airflow 1.10.10 allows getting connections from a Vault: https://airflow.apache.org/blog/airflow-1.10.10/#allow-retrieving-airflow-connections-variables-from-various-secrets-backend Does that help your use-case? |
So basically, with Airflow 1.10.10, if I configure airflow.cfg to use HashiCorp Vault, I can use connections and variables as usual, but instead of getting the data from the Airflow database, it will get it from Vault? |
Exactly. Here is one of the guides: https://www.astronomer.io/guides/airflow-and-hashicorp-vault/ to test it out locally and following docs: |
@kaxil I'm trying to implement a very similar workflow using the new Could we take a similar approach to |
aah I see, apologies @n4rk0o I should have read your description more carefully. Yes the Rendered UI Field currently exposes everything. We should have a way of hiding this. I see two options here:
|
I do like the idea of having this be an optional behavior. I think it would be really great if we could selectively obscure Variables too: some are sensitive while others are not, which is why I like the approach in #1530 of using a list of patterns. Also, I noticed that Variables can be leaked via the |
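To make the pattern-list idea concrete, here is a minimal sketch of selectively masking values by name. It is only an illustration: `SENSITIVE_PATTERNS`, `is_sensitive`, and `mask_variables` are hypothetical names, not Airflow settings or functions, and the patterns themselves are assumptions.

```python
import fnmatch

# Hypothetical pattern list; illustrative only, not an Airflow setting.
SENSITIVE_PATTERNS = ["*password*", "*secret*", "*api_key*", "*token*"]
MASK = "***"


def is_sensitive(name, patterns=SENSITIVE_PATTERNS):
    """Return True if the (case-insensitive) name matches any pattern."""
    lowered = name.lower()
    return any(fnmatch.fnmatchcase(lowered, pattern) for pattern in patterns)


def mask_variables(variables, patterns=SENSITIVE_PATTERNS):
    """Return a copy of the mapping with sensitive values replaced by MASK."""
    return {
        key: MASK if is_sensitive(key, patterns) else value
        for key, value in variables.items()
    }
```

With this approach, only names matching a pattern are obscured, so non-sensitive Variables stay readable in the UI.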
100% agree. Would you like to work on a proposal or a PR for that one? It would definitely be good to get this in for 2.0 |
Hmm, it's been a while since I've worked on the UI. I'll do some digging in the code and see if I can come up with a reasonable design... Do you happen to know off-hand if there is any way to tell at run-time whether DAG code is being compiled by the webserver? Because we need the Variable's I'm also not sure how this would work with serialized DAGs: once they're serialized they will have the unmasked values, so if the webserver just reads the serialized DAG directly we'd have no way to know whether to mask certain Variables, since they would already be just plain values. Maybe we need a more dynamic way of injecting Variables rather than reading them at DAG compile time and writing their values into the DB? I'm thinking some kind of Variable pointer syntax like |
This should happen before the templates are rendered I think for hiding sensitive info in Rendered Fields |
I have a similar use case, but passing the password from a connection object I created in the UI to the environment variable in the KubernetesPodOperator and it is appearing in plain text in the Rendered Template part of the UI for the task. There should be a way to avoid this being printed and visible. |
I should have some time to work on a proposal for this in the next few weeks, but I'm still not sure of the best way to approach the design. The key question I'm stuck on is at what point the templates get filled in... Obviously the executors need to have access to the unmasked values, but if they get injected at DAG compile time then they will be visible in the UI, so I'm guessing the unmasked injection needs to happen at task execution time |
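One way to picture the execution-time injection described above is a placeholder object that renders as a masked reference and is only resolved when the task runs. This is a rough sketch under that assumption, not how Airflow renders templates; `DeferredSecret`, `render_for_ui`, and `resolve_for_execution` are hypothetical names invented for illustration.

```python
class DeferredSecret:
    """Hypothetical placeholder that hides its value until execution time."""

    def __init__(self, key, resolver):
        self._key = key
        self._resolver = resolver  # callable that fetches the real value

    def __repr__(self):
        # What a UI would show: a reference to the secret, never the value.
        return f"<secret:{self._key}>"

    __str__ = __repr__

    def resolve(self):
        """Fetch the real value; meant to be called only by the worker."""
        return self._resolver(self._key)


def render_for_ui(fields):
    """Stringify fields for display; secrets stay as masked references."""
    return {name: str(value) for name, value in fields.items()}


def resolve_for_execution(fields):
    """Replace placeholders with real values right before the task runs."""
    return {
        name: value.resolve() if isinstance(value, DeferredSecret) else value
        for name, value in fields.items()
    }
```

The trade-off is that anything serialized before execution only ever contains the reference, so the webserver has nothing sensitive to leak.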
+1 @marcusianlevine, have you succeeded in implementing this? I can also help if needed. |
I’m thinking about a combination of the two approaches:
All proposed names are subject to discussion. I’m going to look into the implementation. |
WIP PR at #15151. |
@uranusjr Actually how about controlling the template_field view via FAB Permissions? That will allow Admins (and the roles that they allow) to still check those values for debugging but not all. |
Makes sense as well. Do you think there should be some kind of sensitive field naming convention (e.g. leading underscore as suggested previously) and separate permissions to control “public” and sensitive field visibilities, or just one blanket permission for all fields would suffice? |
re: Naming convention The problem with that is that it is too big of a change and affects all the Operators as airflow/airflow/operators/bash.py Lines 127 to 146 in 4d1b2e9
and that it varies sometimes per user, as they might pass a sensitive value to BashOperator via a connection/variable to use an environment variable containing sensitive information. Hence I think the blanket permission for all fields should suffice |
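As a rough illustration of the blanket-permission idea, the rendered fields could be shown in full only to users holding a single permission and masked for everyone else. The permission name and helper below are hypothetical; they are not actual FAB or Airflow identifiers.

```python
# Hypothetical permission name; not an actual FAB or Airflow permission.
CAN_VIEW_RENDERED = "can_view_rendered_fields"


def rendered_fields_for(user_permissions, rendered_fields, placeholder="***"):
    """Return rendered fields as-is for permitted users, masked otherwise."""
    if CAN_VIEW_RENDERED in user_permissions:
        return dict(rendered_fields)
    # One blanket rule: every field is masked, regardless of its name.
    return {name: placeholder for name in rendered_fields}
```

This avoids per-operator naming conventions entirely, at the cost of hiding non-sensitive fields from users without the permission.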
Alternative solution based on permission is in #15158. |
I have a similar need that might not be covered by the solutions posed in this thread: edit: I see there is a private_environment in the DockerOperator plugin for 2.x, so I am copying that functionality by overriding the |
Let's include hiding sensitive data everywhere including logs & rendered templated fields |
What needs to be done to hide sensitive data in logs? There’s already permission to hide task logs ( |
See #15599 |
When you use Vault to get the secret, you can mask it before passing it as params. It will display as **** in the logs as well as in the rendered template. You can use the code below to mask the secret from Vault:
    from airflow.utils.log.secrets_masker import mask_secret
    openssl_service_account_key_read_response = client.secrets.kv.read_secret_version(path=openssl_service_account_secret_path, mount_point=vault_mount_point)
|
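For readers who want to see the general idea behind masking a value in logs, here is a standard-library sketch of a logging filter that redacts registered secrets. It is not Airflow's implementation of `mask_secret`; `RedactingFilter` and its methods are hypothetical names used only to illustrate the technique.

```python
import logging


class RedactingFilter(logging.Filter):
    """Illustrative filter that replaces registered secrets in log messages."""

    def __init__(self, mask="***"):
        super().__init__()
        self._secrets = set()
        self._mask = mask

    def add_secret(self, value):
        """Register a value that must never appear in log output."""
        if value:
            self._secrets.add(str(value))

    def redact(self, text):
        """Replace every registered secret in text, longest first."""
        for secret in sorted(self._secrets, key=len, reverse=True):
            text = text.replace(secret, self._mask)
        return text

    def filter(self, record):
        # Format the message first so secrets passed as args are caught too.
        record.msg = self.redact(record.getMessage())
        record.args = ()
        return True
```

A filter like this would be attached with `handler.addFilter(...)`, with each secret registered as soon as it is fetched and before anything is logged.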
Description
I have been using Airflow for two years now, and I have a plugin that gets the password for a specific account from a Vault and then pushes it through an XCom to reuse it in other tasks.
The problem is that if the value is sensitive, like a password, I can't hide it in the UI, except for XComs if I prefix the key name with an underscore.
E.g.: kwargs['ti'].xcom_push(key='_password', value='my_value')
But for the rendered template UI page, I didn't find anything similar, so if I pull an XCom, its value is shown in the UI, and I want to avoid that.
Maybe it is possible to add a condition in https://github.com/apache/airflow/blob/master/airflow/www/views.py after line 635?
Use case / motivation
I know that I can use connections, but in my case, due to my company's security policy, we have to store secrets in a dedicated Vault.
Related Issues
N/A