Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Treat dag-defined access_control as authoritative if defined #33632

Merged
merged 6 commits into from
Aug 24, 2023

Conversation

SamWheating
Copy link
Contributor

@SamWheating SamWheating commented Aug 22, 2023

closes: #32839

(Copying my comment from the original issue)

We have two places to define permissions (manually via the UI/CLI and in code via DAG.access_control) and there's no stance on which one takes precedence, which leads to all sorts of unexpected behaviour (mostly around not cleaning up permissions enough, or cleaning up permissions too aggressively)

I would propose making the access control for a given DAG authoritative as long as it is explicitly set. This way, if a DAG was updated, removing access_control from a DAG would not trigger a cleanup, but setting access_control={} would.

So we have three cases. A DAG defined like this:

with DAG(
    dag_id="test",
    schedule="0 0 * * *",
    start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
    access_control={'role1': {'can_read'}, 'role2': {'can_read', 'can_edit', 'can_delete'}}
) as dag:

Would overwrite any existing dag-level permissions for this DAG, ensuring that the roles listed here are the only ones with permissions specific to this DAG.

While a DAG with explicitly empty access_contol, like this:

with DAG(
    dag_id="test",
    schedule="0 0 * * *",
    start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
    access_control={}
) as dag:

Would explicitly clear out all DAG-level permissions for this role.

While a DAG without any access_control defined:

with DAG(
    dag_id="test",
    schedule="0 0 * * *",
    start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
) as dag:

Would just defer to the existing DAG-level roles in the database.

This presents its own hazards around dangling and stray permissions, but I think that any approach here is going to have some sort of downside. The important part is that the approach to DAG-level access is clearly-defined and well documented (neither of which is true at the moment).

TODO (before merging):

I'll do these things as a part of this PR, but I want to get early feedback on the approach before I spend too much time here.

  • Add documentation around the precedence of different DAG-level access controls.
  • Update and add tests

@boring-cyborg boring-cyborg bot added the area:webserver Webserver related Issues label Aug 22, 2023
@SamWheating SamWheating changed the title Proposal: treat dag-defined access_control as authoritative only if it exists Proposal: treat dag-defined access_control as authoritative if defined Aug 22, 2023
@SamWheating SamWheating marked this pull request as ready for review August 22, 2023 22:50

if access_control:

if access_control is not None:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the most important change in this PR - if access control will proceed into the following block if and only if access_control is a non-empty dict, whereas if access control is not None will proceed given any dict, including an empty one.

@SamWheating
Copy link
Contributor Author

cc @potiuk - follow-up on our conversation yesterday. Have a look at the description here and let me know if you agree.

@potiuk
Copy link
Member

potiuk commented Aug 23, 2023

Yes. I agree with the reasoning.

Rebasing now to fix problem with image builds from earlier (autocomplete from the phone was strange last night :))

@potiuk potiuk changed the title Proposal: treat dag-defined access_control as authoritative if defined Treat dag-defined access_control as authoritative if defined Aug 24, 2023
Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Needs static code check /docs fixing. I like the docs and explanation and finally a bit more clarity on what the users should expect.

:start-after: [START howto_no_fine_grained_access_control]
:end-before: [END howto_no_fine_grained_access_control]

- Removing the access_control block from a DAG altogether (or setting it to `None`) can leave dangling permissions.
Copy link
Member

@potiuk potiuk Aug 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. This is a side effect that we can accept.

One small comment. Maybe a good idea would be to also accompany it with a new CLI command airflow dags clean-permissions and a new API call (eventually) to do it. Those can be separate PRs, but having them explicitly spelled out as way to clean permissions on a DAG that previously had permissions is a way to workaround the problem with dangling permissions.

I see that as an edge case, with this change there is already a way to clean permissions by setting access_control to {} - but I can see a situation that somoene would like to recover from the historical changes.

Copy link
Contributor

@huymq1710 huymq1710 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
Thank you very much

@potiuk
Copy link
Member

potiuk commented Aug 24, 2023

Some static checks :(, Recommend installing pre-commit. Also breeze statich-checks --only-my-changes is helpful.

@SamWheating
Copy link
Contributor Author

breeze static-checks --only-my-changes

Good to know, I was just running ruff + black checks but it looks like there's a few more failing 👀 I'll use this command in the future.

@potiuk
Copy link
Member

potiuk commented Aug 24, 2023

breeze static-checks --only-my-changes

Good to know, I was just running ruff + black checks but it looks like there's a few more failing 👀 I'll use this command in the future.

Highly recommend just pre-commit install. I generally almost never run any static checks manually for airflow. It happens all automagically with pre-commits, so I just don't think about them and fix them (or get them fixed automatically) when I commit stuff.

@potiuk
Copy link
Member

potiuk commented Aug 24, 2023

Still small thing in code blocks :(

@potiuk potiuk merged commit 370348a into apache:main Aug 24, 2023
42 checks passed
@potiuk
Copy link
Member

potiuk commented Aug 24, 2023

🎉

@ephraimbuddy ephraimbuddy added the type:bug-fix Changelog: Bug Fixes label Aug 28, 2023
ephraimbuddy pushed a commit that referenced this pull request Aug 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:webserver Webserver related Issues type:bug-fix Changelog: Bug Fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DAG-level permissions set in Web UI disappear from roles on DAG sync
4 participants