Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move Rest API auth related methods to FAB auth manager #34924

Merged
merged 9 commits into from
Oct 20, 2023

Conversation

vincbeck
Copy link
Contributor

Multiple authentications are possible for the Rest API. See documentation. With AIP-56, all these configurations besides the session one (which is the default one) should belong to the FAB auth manager.

No new code is created or being removed in this PR, it is only being moved.


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@vincbeck vincbeck added the AIP-56 Extensible user management label Oct 13, 2023
@boring-cyborg boring-cyborg bot added area:API Airflow's REST/HTTP API area:webserver Webserver related Issues labels Oct 13, 2023
@vincbeck vincbeck added the use public runners Makes sure that Public runners are used even if commiters creates the PR (useful for testing) label Oct 13, 2023
@vincbeck vincbeck closed this Oct 13, 2023
@vincbeck vincbeck reopened this Oct 13, 2023
@potiuk
Copy link
Member

potiuk commented Oct 18, 2023

I think we should make it explicit in docs that those authentication backends for the API are available only when FAB auth manager is used. Likely we should also rename/depreceate the configuration option and make it more of "fab_authentication_manage" option - to configure the backends. This should become a configuration option of "FAB" one effectively which backend is used.

There are two things to add (and all this might be part of separate PR I just wanted to make sure that we think about it).

  • I understand, that in case of the Auth Manager, it will be possible to implement the authentication for API as part of the auth manager and Auth Manager will handle API authentication ? If so, I think we need to describe it.

  • In case of UI - it uses API calls using "session" backend and, it should be somewhat independent of the Auth Manager backend - so basically all API calls that have a valid flask session, should be allowed always (this is what has been added at some point in time I believe that we automatically add "session" backend when we did not add it explicitly in the backend configuration, so I think we need to figure out how to approach it for Auth Managers.

@vincbeck
Copy link
Contributor Author

Multiple answers here :)

I think we should make it explicit in docs that those authentication backends for the API are available only when FAB auth manager is used. Likely we should also rename/depreceate the configuration option and make it more of "fab_authentication_manage" option - to configure the backends. This should become a configuration option of "FAB" one effectively which backend is used.

I agree. But this one depends on whether we want to move this FAB auth manager to a separate provider. If so, then the configuration will be moved to the provider configuration (nice feature you needed :)). If not, then, as you said, we'll just have to rename these configurations. There is an issue for this one: #32210.

There are two things to add (and all this might be part of separate PR I just wanted to make sure that we think about it).

  • I understand, that in case of the Auth Manager, it will be possible to implement the authentication for API as part of the auth manager and Auth Manager will handle API authentication ? If so, I think we need to describe it.

The authentication for the AIP has not changed. The authentication check is still on here. The only difference being, in case of an auth manager different from FAB, get_airflow_app().api_auth will always return the session auth because the others belong to the FAB auth manager (unless the auth manager provide additional ways). I am not sure I replied to your question tho :)

  • In case of UI - it uses API calls using "session" backend and, it should be somewhat independent of the Auth Manager backend - so basically all API calls that have a valid flask session, should be allowed always (this is what has been added at some point in time I believe that we automatically add "session" backend when we did not add it explicitly in the backend configuration, so I think we need to figure out how to approach it for Auth Managers.

I guess by UI you refer to the React UI (not the model views auto generated by FAB, these ones do not use APIs). The way I see it and is done, for UI and API, when the backend received a request, it checks if the user is logged in using is_logged_in API. Are you saying that we should decorelate UI and API and have a different mechanism?

@potiuk
Copy link
Member

potiuk commented Oct 18, 2023

Multiple answers here :)

Because there were multiple questions :) .

I think we should make it explicit in docs that those authentication backends for the API are available only when FAB auth manager is used. Likely we should also rename/depreceate the configuration option and make it more of "fab_authentication_manage" option - to configure the backends. This should become a configuration option of "FAB" one effectively which backend is used.

I agree. But this one depends on whether we want to move this FAB auth manager to a separate provider. If so, then the configuration will be moved to the provider configuration (nice feature you needed :)). If not, then, as you said, we'll just have to rename these configurations. There is an issue for this one: #32210.

Cool.

  • I understand, that in case of the Auth Manager, it will be possible to implement the authentication for API as part of the auth manager and Auth Manager will handle API authentication ? If so, I think we need to describe it.

The authentication for the AIP has not changed. The authentication check is still on here. The only difference being, in case of an auth manager different from FAB, get_airflow_app().api_auth will always return the session auth because the others belong to the FAB auth manager (unless the auth manager provide additional ways). I am not sure I replied to your question tho :)

Obviously. I missed the fact that "session_auth" remained where it was. I see that "session" will remain where it was. Clear. And expectation is that if you implement an AuthManger api_auth should return the session.

But that sparks another question. Maybe (just maybe and maybe that is the next step) we should not be to hasty to move few others to FAB? What if we also keep dany_all and default in "api" rather than move them to FAB ? They do not seem to have any relation to FAB whatsoever and are directly reusable - for example if you would like to have no API, you'd get AuthManager return deny_all in api_auth property? That seems like very easy thing to do (especially if we describe that those are example api_auth methods that can be returned. Then FAB Auth Manager would simply become the first of the Auth Managers that will use them (when configured in the FAB Auth Manager configuration).

And maybe (that's a bit more tricky and maybe it is next step or follow-up) - we also keep kerberos_auth in "reusable" api package, only we parameterise them in the way, that they can be easiy used by Auth Manager - for example get kerberos_auth and expose find_user method. Then anyone implementing their AuthManager could choose to return kerberos_auth (adding just find_user implementation returning the right user ). I think there is much more "airflow/kerberos" code there than "FAB" in the kerberos_auth - it looks like "almost reusable" implementation of kerberos authentication - nicely integrated with how airflow kerberos CLI works with refreshing the kerberos token - the only thing it needs is find_user implementation.

  • In case of UI - it uses API calls using "session" backend and, it should be somewhat independent of the Auth Manager backend - so basically all API calls that have a valid flask session, should be allowed always (this is what has been added at some point in time I believe that we automatically add "session" backend when we did not add it explicitly in the backend configuration, so I think we need to figure out how to approach it for Auth Managers.

I guess by UI you refer to the React UI (not the model views auto generated by FAB, these ones do not use APIs). The way I see it and is done, for UI and API, when the backend received a request, it checks if the user is logged in using is_logged_in API. Are you saying that we should decorelate UI and API and have a different mechanism?

Yes. That I missed the fact that we are leaving the session and not moving it. But...

As far as I understand (and I might be wrong), currently the way how to configure backends in order to get the UI working is: airflow.api.auth.backend.basic_auth,airflow.api.auth.backend.session. This will allow the API user to use basic authentication, but also the React UI to communicate with the webserver via the same API (session). Does it mean that api_auth method of AuthManager will return array of auths as well? Or do we expect AuthManagers to return api_auth that will already handle session authentication ? Or maybe we will always append the "session" auth backend when we check for API - regardless from Auth Manager ? I could have missed that so sorry, If I am asking naive questions here :).

@vincbeck
Copy link
Contributor Author

vincbeck commented Oct 18, 2023

Obviously. I missed the fact that "session_auth" remained where it was. I see that "session" will remain where it was. Clear. And expectation is that if you implement an AuthManger api_auth should return the session.

But that sparks another question. Maybe (just maybe and maybe that is the next step) we should not be to hasty to move few others to FAB? What if we also keep dany_all and default in "api" rather than move them to FAB ? They do not seem to have any relation to FAB whatsoever and are directly reusable - for example if you would like to have no API, you'd get AuthManager return deny_all in api_auth property? That seems like very easy thing to do (especially if we describe that those are example api_auth methods that can be returned. Then FAB Auth Manager would simply become the first of the Auth Managers that will use them (when configured in the FAB Auth Manager configuration).

And maybe (that's a bit more tricky and maybe it is next step or follow-up) - we also keep kerberos_auth in "reusable" api package, only we parameterise them in the way, that they can be easiy used by Auth Manager - for example get kerberos_auth and expose find_user method. Then anyone implementing their AuthManager could choose to return kerberos_auth (adding just find_user implementation returning the right user ). I think there is much more "airflow/kerberos" code there than "FAB" in the kerberos_auth - it looks like "almost reusable" implementation of kerberos authentication - nicely integrated with how airflow kerberos CLI works with refreshing the kerberos token - the only thing it needs is find_user implementation.

That's very good point actually. That makes sense to me. session, default and deny_all should stay where they are. I'll modify the PR accordingly. Regarding kerberos, I agree with you, it is MOSTLY independent besides the usage of find_user. My only concern is, it looks "expensive" to force auth managers to implement find_user just for this usage. Some auth managers might not even want to use API, or Kerberos and they'll have to implement find_user. Another solution would be to return NotImplemented by default and try/catch this exception. If this exception is thrown, then no Kerberos

@potiuk
Copy link
Member

potiuk commented Oct 18, 2023

Another solution would be to return NotImplemented by default and try/catch this exception. If this exception is thrown, then no Kerberos

I'd say we could just leave a base kerberos_auth that will be PARAMETERIZED with find_user method, You could do it in this way:

kerberos_auth_base.py:

def requires_authentication(function: T, find_user: Callable[..right type here]):
    ,,,,
     
    g.user = find_user(username=ctx.kerberos_user)

fab/kerberos_auth.py:

from .. import kerberos_auth_base
from functools import partial

requires_authentication = partial(kerberos_auth_base, find_user=get_airflow_app().appbuilder.sm.find_user)

Might be a bit over-the-top, but I think it solves reusability in a very nice way

@vincbeck
Copy link
Contributor Author

vincbeck commented Oct 18, 2023

Another solution would be to return NotImplemented by default and try/catch this exception. If this exception is thrown, then no Kerberos

I'd say we could just leave a base kerberos_auth that will be PARAMETERIZED with find_user method, You could do it in this way:

kerberos_auth_base.py:

def requires_authentication(function: T, find_user: Callable[..right type here]):
    ,,,,
     
    g.user = find_user(username=ctx.kerberos_user)

fab/kerberos_auth.py:

from .. import kerberos_auth_base
from functools import partial

requires_authentication = partial(kerberos_auth_base, find_user=get_airflow_app().appbuilder.sm.find_user)

Might be a bit over-the-top, but I think it solves reusability in a very nice way

Nice! This way, you just need to do something similar if you want your auth manager to provider Kerberos support!

@potiuk
Copy link
Member

potiuk commented Oct 18, 2023

Nice! This way, you just need to do something similar if you want your auth manager to provider Kerberos support!

Precisely

@vincbeck
Copy link
Contributor Author

As far as I understand (and I might be wrong), currently the way how to configure backends in order to get the UI working is: airflow.api.auth.backend.basic_auth,airflow.api.auth.backend.session. This will allow the API user to use basic authentication, but also the React UI to communicate with the webserver via the same API (session). Does it mean that api_auth method of AuthManager will return array of auths as well? Or do we expect AuthManagers to return api_auth that will already handle session authentication ? Or maybe we will always append the "session" auth backend when we check for API - regardless from Auth Manager ? I could have missed that so sorry, If I am asking naive questions here :).

Here is the way I see it :) There is no such api_auth in auth managers. There are API auth provided by Airflow itself (session, default and deny_all) and API auth provided by auth managers (e.g. basic_auth for FAB auth manager). There is no need to have an API which return the list of API auth. When it comes to configuring which API auth you want to use you could do something like this: airflow.auth.managers.fab.api.auth.backend.basic_auth,airflow.api.auth.backend.session.

@potiuk
Copy link
Member

potiuk commented Oct 18, 2023

Here is the way I see it :) There is no such api_auth in auth managers. There are API auth provided by Airflow itself (session, default and deny_all) and API auth provided by auth managers (e.g. basic_auth for FAB auth manager). There is no need to have an API which return the list of API auth. When it comes to configuring which API auth you want to use you could do something like this: airflow.auth.managers.fab.api.auth.backend.basic_auth,airflow.api.auth.backend.session.

Ah... yes. Obviously ! That makes perfect sense

@vincbeck vincbeck merged commit 48c17fb into apache:main Oct 20, 2023
44 checks passed
@vincbeck vincbeck deleted the vincbeck/api_auth branch October 20, 2023 14:33
@ephraimbuddy ephraimbuddy added the changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) label Oct 27, 2023
@ephraimbuddy ephraimbuddy added this to the Airflow 2.8.0 milestone Oct 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
AIP-56 Extensible user management area:API Airflow's REST/HTTP API area:webserver Webserver related Issues changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) use public runners Makes sure that Public runners are used even if commiters creates the PR (useful for testing)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants