Security question #20

Closed
afarbos opened this issue Jun 11, 2015 · 4 comments

afarbos commented Jun 11, 2015

I want to use S3, and I see that Airflow works well with AWS (EC2, S3). But using S3 means that my data will be stored outside my own infrastructure. What does Airflow offer in terms of security?

Contributor

artwr commented Jun 11, 2015

Hi @glodus,

S3 can be locked down pretty tightly, so that it is only readable with an access_key and a secret_key and not by the rest of the world. You would have to define the appropriate security policy for the resource (a bucket or a subdirectory). This is the setup we use. Airflow is also equipped to handle IAM roles, if your setup permits.
Let me know if you have more questions about this!
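
For illustration only, here is a minimal sketch of what such a policy could look like when built in Python. The bucket name, account ID, role ARN, and prefix are all hypothetical placeholders, and this is not necessarily the exact policy used in the setup described above. It relies on buckets being private by default and grants one IAM principal access to one prefix.

```python
import json


def build_bucket_policy(bucket: str, principal_arn: str, prefix: str = "") -> dict:
    """Build a policy letting one IAM principal read and write under a prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level access, limited to keys under the prefix.
                "Sid": "AllowObjectAccess",
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
            {
                # Listing is a bucket-level action, so it targets the bucket ARN.
                "Sid": "AllowListing",
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
            },
        ],
    }


# Hypothetical names, shown only to make the sketch concrete.
policy = build_bucket_policy(
    "example-airflow-data",
    "arn:aws:iam::123456789012:role/example-airflow-worker",
    prefix="pipelines/",
)
print(json.dumps(policy, indent=2))
```

The resulting JSON could then be attached to the bucket through the AWS console or an SDK; that step is left out here so the sketch runs without credentials.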

Author

afarbos commented Jun 12, 2015

I was thinking more about what I would put on the Airflow server itself: the "workflow", "pipeline", or DAGs. It would be very unfortunate if someone else could read and understand them.

Not really related, but still a question: do you handle LDAP authentication, or does flask_login perhaps already handle it?

Member

mistercrunch commented

From our perspective, security is almost all-or-nothing as far as executing pipelines goes. If you have access to an Airflow box, you pretty much have full access to all the systems Airflow interfaces with. We spoke internally about using Airflow for more sensitive datasets that not all engineers should have access to, and the solution there would be to set up a separate Airflow environment for that purpose.

The UI is mostly read-only, and there are different levels of access there. It's possible to create a limited access level where people can only look at pipeline definitions and how they progress.

The flask_login backdoor should allow you to do pretty much anything in terms of authentication. Our setup has a reverse proxy that takes care of SSL and LDAP authentication and injects HTTP headers into the request, and our airflow_login module just looks for those headers to grant the right level of access. Using flask_login with LDAP is probably fairly well documented. A temporary setup can be achieved by having privileged users connect through an SSH tunnel.
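
To make the header-based approach a bit more concrete, here is a rough, framework-free sketch of the "look for the headers and grant a level of access" step. It is not the actual airflow_login module described above: the header names, group names, and access levels are all hypothetical, and the wiring into flask_login is left out.

```python
from typing import Mapping, Optional

# Hypothetical header names; use whatever your reverse proxy actually sets.
USER_HEADER = "X-Remote-User"
GROUPS_HEADER = "X-Remote-Groups"

# Hypothetical mapping from LDAP group to access level.
GROUP_LEVELS = {"airflow-admins": "admin", "airflow-viewers": "read_only"}
LEVEL_RANK = {"read_only": 1, "admin": 2}


def access_level(headers: Mapping[str, str]) -> Optional[str]:
    """Return the highest access level granted by the proxy headers, or None."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}

    user = normalised.get(USER_HEADER.lower(), "").strip()
    if not user:
        return None  # the proxy did not authenticate anyone

    raw_groups = normalised.get(GROUPS_HEADER.lower(), "")
    groups = [group.strip() for group in raw_groups.split(",")]

    # Keep only the groups we know about and pick the most privileged level.
    levels = [GROUP_LEVELS[group] for group in groups if group in GROUP_LEVELS]
    return max(levels, key=LEVEL_RANK.__getitem__, default=None)


print(access_level({"X-Remote-User": "alice", "X-Remote-Groups": "airflow-viewers"}))
```

Note that a scheme like this is only safe when the web server is reachable exclusively through the proxy; otherwise a client could send the same headers directly.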

Author

afarbos commented Jun 12, 2015

Thanks.
