Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AIP-56 - FAB AM - Move FAB auth manager to new provider #32210

Closed
vincbeck opened this issue Jun 27, 2023 · 22 comments
Closed

AIP-56 - FAB AM - Move FAB auth manager to new provider #32210

vincbeck opened this issue Jun 27, 2023 · 22 comments
Assignees
Labels
AIP-56 Extensible user management

Comments

@vincbeck
Copy link
Contributor

It has been decided to keep this task for the end. The purpose of this task is to:

  • Decide whether is it feasible and a good idea to move the FAB auth manager from core Airflow to a new provider
  • If yes, create a new provider and move the auth manager in this newly created provider
  • Make it a dependency of Airflow since it will be used by default
@vincbeck
Copy link
Contributor Author

vincbeck commented Nov 8, 2023

I would like to start this discussion. I think everyone agrees that (if not, please call it out!),if possible, we should move the FAB auth manager to a new provider. If I remember well @jedcunningham had some concerns about this move, especially regarding the migrations right? Could you please elaborate a bit more on the challenge? I would like to see if there is no solution to go around that, 'cause I would really love to move this FAB auth manager to providers.

@potiuk

@jedcunningham
Copy link
Member

Yeah, my concern was around the db. Things like this and this. Today providers have no way to hook into the db creation/migration process.

@potiuk
Copy link
Member

potiuk commented Nov 8, 2023

I think it should not be too difficult to add though as a provider capability. I'v done that for configuration it took me few days, and DB migration seems like way simpler problem to solve - especially that we have building blocks in Airflow for that. Migration might be a big tricky - but ultimately we might only benefit from it.

@potiuk
Copy link
Member

potiuk commented Nov 8, 2023

It literally looks like a hook wher you can discover "if" a given provider supports migration and hook it in, and a way to query providers to see if they "need" migration (to account for independent upgrades/downgrades of providers from core) + handle back-compatibility - we aalredy have the DB quite separated - we know which tables belong to FAB and which not,.

Alembic alredy supports separate "migration environments' I believe (I am not too experienced with Alembic though). It would be a bit tricky to move before/ after migration but since FAB provider will likely come pre-installed, we might not worry too much about back-compat (i.e. airflow >= 2.8 for provider and mandtory FAB provider for Airflow should solve the back-compatibility problem nicely).

@jedcunningham
Copy link
Member

Unfortunately I think there is a bit more complexity here. The db cli is very "core" centric right now. We'd have to support downgrade --from-version, --to-version and friends, for not only core but all providers that do db things. Sure, a bare db upgrade is easy but it's also the simplest case. It's definitely solvable though.

If the FAB provider will be pre-installed, does moving this stuff to a provider really buy us anything? Even if you didn't use it, you'd still get the db stuff and code anyways (unless you went off the beaten path of course).

@potiuk
Copy link
Member

potiuk commented Nov 8, 2023

Unfortunately I think there is a bit more complexity here. The db cli is very "core" centric right now. We'd have to support downgrade --from-version, --to-version and friends, for not only core but all providers that do db things. Sure, a bare db upgrade is easy but it's also the simplest case. It's definitely solvable though.

If the FAB provider will be pre-installed, does moving this stuff to a provider really buy us anything? Even if you didn't use it, you'd still get the db stuff and code anyways (unless you went off the beaten path of course).

Yes. There are multiple benefits of having FAB as provider even if it is preinstalled.

For example it would be possible to remove it more easly in next versions - we might make NOT preinstalled in 2.9 for example.

Right now when we moved daskexecutor to provider, we KNOW we can move it out. Also when FAB is in provider we can potentially suspend it's development, tests and releases when we decide that we do not want to maintain it any more. This might mean for example that we stop developing it main and it does not hold us back (people might simply continue using the version that is released already.

But there is way more. It will NOT keep airflow tied to specific version of FAB - we might have 5 versions of FAB provider tied to specific FAB version - each of them linked to a different FAB version by pinning. Which might unblock some scenarios. For example people might want to upgrade to latest FAB version because of security issue while something might still be keeping them with older Airflow version.

Or the opposite - for some reasons (changed mechanism of OAUT authentication as happened in the past) people might want to stick to old FAB while upgrading to newer Airflow. We never really told that to people - but in Airflow 2 we actually HAD breaking changes becasue FAB changed the way how you configure OAUTH integration. It broke a number of people integratiosn in 2.2 or 2.3. Having FAB provider separately we could make a breaking release of provider while allowing people to stay with the old version of provider and still upgrade to newer version of Airflow.

So yeah. There are plenty of benefits of splitting when it comes to a number of different scenarios for our users (most of them happened in the past so well it's not theorethical).

And really whether we eventually make FAB truly optional is another story but we once we do separate it to provider, it just opens up all kinds of options we might have and know what it brings. If we won't do it, we cannot even start a discussion about it becase just not knowing what it involves (like this migration for example) will hold us back from having discussion on whether we want to do it at all or not and what benefits it will bring.

@potiuk
Copy link
Member

potiuk commented Nov 8, 2023

Also. purely from the speed of delivering security fixes (which we had plenty of in FAB). We could also handle bugfixes faster - we would not have to wait with releasing a security release for the next airflow version - it could just come in the next provider's wave.

@potiuk
Copy link
Member

potiuk commented Nov 8, 2023

(I just added it to "security improvements" project because of the last point cc: @hussein-awala @pierrejeambrun @eladkal)

@potiuk
Copy link
Member

potiuk commented Nov 8, 2023

Unfortunately I think there is a bit more complexity here. The db cli is very "core" centric right now. We'd have to support downgrade --from-version, --to-version and friends, for not only core but all providers that do db things. Sure, a bare db upgrade is easy but it's also the simplest case. It's definitely solvable though.

Yeah. I think we would have to have a separate airflow providers db CLI + I think only the default migration should automically call the "provider's" one, the "from/to" migration would only do the "core" or "specific provider" migration (for airflow providers db <provider>) This would have to be added (like the few things I added for config) but mostly mimicking what the main db command is doing. By building blocks I mean we know already what are the use cases and options, and we have the code, so implementing similar approach by either refactoring and DRY'ing the code or by copying bits and pieces should be way easier than doing that all from the scratch.

@jedcunningham
Copy link
Member

Fair point on being able to decouple core from a specific FAB version. I hadn't considered that aspect, and that does make it worthwhile.

I worry that having airflow db migrate do core + providers, but the rest just core will be confusing. I'm sure we can come up with something that makes sense though.

@hussein-awala
Copy link
Member

Also. purely from the speed of delivering security fixes (which we had plenty of in FAB). We could also handle bugfixes faster - we would not have to wait with releasing a security release for the next airflow version - it could just come in the next provider's wave.

+1

For the db command, I think it will be complex to ensure compatibility between all the combinations (Airflow since min version, FAB provider version), so I wonder if for this new proposed provider, we can apply a different strategy for min_airflow_version, by updating it on each release to latest Airflow version, in this case we can benefit from the frequent releases, without worrying about backward compatibility.

@potiuk
Copy link
Member

potiuk commented Nov 9, 2023

Yeah. It might simplify things indeed. Rather than merging the commands we could keep them separate and simply provider could check at the initialization of the auth manager if migration is done and sys.exit if migration is not applied for it.

Maybe we even do not have to have airflow providers db command, maybe simply FAB authenticator (when configured as being currently used) could contribute CLI command - in the similar way as "currently selected excutor" does by @o-nikolas ).

That would make a change in tthe current upgrade process (and helm chart and others)

airflow db upgrade
airflow fab db upgrade

For example. That would be deinitely cleanest solution. We've discussed it before and I believe we agreed that Semver says nothig about 'stability' of migration process, so if the migration to new FAB as authentication will require separate airflow fab db upgrade command - so be it.

And ultimately, I think we want people to realise FAB is separate part and that they can switch to their own authentication and that it needs separate "maintenance". It ticks all the boxes for me.

@vincbeck
Copy link
Contributor Author

Resolved as part of #35926

@vincbeck
Copy link
Contributor Author

vincbeck commented Dec 11, 2023

Should we also move the config that is used only by the FAB auth manager to the provider config? If so, how to do that in a backward compatible way? Can a config in a provider define a config section that already exist in core? As far as I remember there was a discussion about it and (if my memory is good, bold bet!) it has been decided to not have any control in place but "trust" the community to avoid such collisions. If this is true, could we have config from one section defined in 2 different places (core and providers)?

@potiuk
Copy link
Member

potiuk commented Dec 11, 2023

Good point. Since FAB provider wlll have >= 2.9.0, it's perfectly fine to simply move it directly to provider.

@potiuk
Copy link
Member

potiuk commented Dec 11, 2023

And longer explanation -> both will produce the very same config entries, it's just the matter to not have code that handles the same section from both provider and core. But since:

  • for airflow < 2.9.0 - there will never be FAB provider installed - > the section is "managed" by airflow
  • for airflow >= 2.9.0 - there will not be configuraiton in airflow and there will be fab provider (and configuration is managed by the provider).

@vincbeck
Copy link
Contributor Author

But there are some configs in the section webserver which should stay in core Airflow

@vincbeck
Copy link
Contributor Author

vincbeck commented Dec 11, 2023

We cannot just move everything to fab provider, some are not related to fab provider

@potiuk
Copy link
Member

potiuk commented Dec 11, 2023

then we need to take more nuanced approach and implement deprecation. I think we should do it in two stages - first add deprecation and extract everything to separate section and then move section to FAB.

We already have mechanism in Airlfo config to add deprecations and fallback from new options to old ones.

@vincbeck
Copy link
Contributor Author

Sounds good

@vincbeck
Copy link
Contributor Author

vincbeck commented Dec 12, 2023

I was just starting this task but I did not think about the consequences. If we deprecate configs in the webserver section in core Airflow and copy these to the FAB provider under fab section. It means I need to make some massive updates on the code and every-time there is a conf.getboolean("webserver", "<config>"), I need to basically do conf.getboolean("webserver", "<config>") or conf.getboolean("fab", "<config>"). I am not sure we want to go down that path. WDYT?

@vincbeck
Copy link
Contributor Author

Not so much actually: #36232

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
AIP-56 Extensible user management
Projects
No open projects
Development

No branches or pull requests

4 participants