Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Azure AD Workload Identity support for Azure Scalers and Key Vault #2907

Merged
merged 30 commits into from
May 10, 2022

Conversation

v-shenoy
Copy link
Contributor

@v-shenoy v-shenoy commented Apr 13, 2022

Signed-off-by: Vighnesh Shenoy [email protected]

Support for Workload Identity as a pod identity provider.

Related PRs -
Helm Changes - kedacore/charts#263, kedacore/charts#264
Doc changes - kedacore/keda-docs#752

Checklist

  • Commits are signed with Developer Certificate of Origin (DCO - learn more)
  • Tests have been added
  • Changelog has been updated and is aligned with our changelog requirements
  • A PR is opened to update our Helm chart (repo) (if applicable, ie. when deployment manifests are modified)
  • A PR is opened to update the documentation on (repo) (if applicable)

Relates to #2487

Copy link
Member

@tomkerkhove tomkerkhove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, added a few remarks.

Would you mind opening a PR for the docs please?

pkg/scalers/azure/azure_eventhub.go Show resolved Hide resolved
pkg/scalers/azure_monitor_scaler.go Show resolved Hide resolved
@v-shenoy
Copy link
Contributor Author

LGTM, added a few remarks.

Would you mind opening a PR for the docs please?

Yeah, I have some minor changes left to do. Will get onto the docs after I am done with those.

@v-shenoy v-shenoy marked this pull request as ready for review April 22, 2022 13:29
@v-shenoy v-shenoy requested a review from a team as a code owner April 22, 2022 13:29
@v-shenoy
Copy link
Contributor Author

@tomkerkhove @JorTurFer @zroubalik Would be helpful if you can run all the azure* e2e tests.

@v-shenoy v-shenoy changed the title Azure AD Workload Identity support for Azure Scalers. Azure AD Workload Identity support for Azure Scalers and Key Vault. Apr 22, 2022
@tomkerkhove
Copy link
Member

tomkerkhove commented Apr 25, 2022

/run-e2e azure*
Update: You can check the progres here

@v-shenoy
Copy link
Contributor Author

v-shenoy commented Apr 25, 2022

@tomkerkhove I am not sure what changed exactly, but the Application Insight tests started failing recently (even on my dev cluster). Looking at the error, it seemed that the library was not able to send the metrics correctly with just the instrumentation key, and using the connection string resolves the issue.

Could you add AZURE_APP_INSIGHTS_CONNECTION_STRING GitHub secret, and rerun the azure* e2e tests?
(Note - I have already updated the workflow yamls with the env variable.)

@tomkerkhove
Copy link
Member

PR is open: #2934

@JorTurFer
Copy link
Member

JorTurFer commented Apr 25, 2022

/run-e2e azure*
Update: You can check the progres here

@JorTurFer
Copy link
Member

(Note - I have already updated the workflow yamls with the env variable.)

It's not enough, the yaml needs to be in main to be applied. Please undo the changes in the workflow yamls 🙏

@v-shenoy
Copy link
Contributor Author

v-shenoy commented Apr 25, 2022

@tomkerkhove @JorTurFer I want to raise a PR for the documentation here, but before that we would need to finalize upon and complete the helm changes.

I can do the helm changes if someone guides me on how to go about them.

@tomkerkhove
Copy link
Member

Why does the docs have a pre-requisite on Helm? Normally that should not be the case.

Our Helm charts are available in https://github.com/kedacore/charts where you can just open a PR to change the YAML specs (no need to bump versions) Let me know if you need help.

@v-shenoy
Copy link
Contributor Author

Why does the docs have a pre-requisite on Helm? Normally that should not be the case.

Our Helm charts are available in https://github.com/kedacore/charts where you can just open a PR to change the YAML specs (no need to bump versions) Let me know if you need help.

Similar to pod identity, we need to mention the flags the user can use, right?

@v-shenoy
Copy link
Contributor Author

v-shenoy commented Apr 26, 2022

I will go ahead with the helm changes as per the flags mentioned in this comment. Had a question regarding this. There are other changes in the main KEDA code, that also require helm changes (ex - TriggerAuth / ClusterTriggerAuth CRD changes done for Key Vault). Do I need to copy those changes over as well, right now, or do I just add my changes and the rest will be copied over during the release process?

For context, I only need to change the service account template and values used for helm.

@zroubalik

@tomkerkhove
Copy link
Member

Why does the docs have a pre-requisite on Helm? Normally that should not be the case.
Our Helm charts are available in kedacore/charts where you can just open a PR to change the YAML specs (no need to bump versions) Let me know if you need help.

Similar to pod identity, we need to mention the flags the user can use, right?

You can already change v2.7 docs; there's no problem there as they only go live when we ship v2.7.

Our new Helm chart version will only be available then as well.

@v-shenoy
Copy link
Contributor Author

You can already change v2.7 docs; there's no problem there as they only go live when we ship v2.7.

No, I was confused about the specific flags to add in docs as I was not sure if we had finalized them :)

@v-shenoy
Copy link
Contributor Author

v-shenoy commented Apr 26, 2022

Helm Changes - kedacore/charts#263, kedacore/charts#264

@v-shenoy
Copy link
Contributor Author

Doc changes - kedacore/keda-docs#752

@v-shenoy
Copy link
Contributor Author

@tomkerkhove Could you once again run the e2e tests? Thanks.

@tomkerkhove
Copy link
Member

tomkerkhove commented Apr 26, 2022

/run-e2e azure*

Update: You can check the progres here

@tomkerkhove tomkerkhove changed the title Azure AD Workload Identity support for Azure Scalers and Key Vault. feat: Azure AD Workload Identity support for Azure Scalers and Key Vault Apr 26, 2022
@v-shenoy
Copy link
Contributor Author

No tests seem to be running here, weirdly.

@JorTurFer
Copy link
Member

JorTurFer commented May 9, 2022

@v-shenoy , rebase main again please, there is a missing line
It's there, running

@JorTurFer
Copy link
Member

JorTurFer commented May 9, 2022

/run-e2e azure*
Update: You can check the progres here

@JorTurFer
Copy link
Member

JorTurFer commented May 9, 2022

Checking the cluster, there are resources related with aad-wi even do the e2e tests have finished
image
Is this expected?

@v-shenoy
Copy link
Contributor Author

v-shenoy commented May 9, 2022

Checking the cluster, there are resources related with aad-wi even do the e2e tests have finished image Is this expected?

@JorTurFer
I was initially installing these components in the setup file, and removing them during the clean up. I ran into some issues where the tests would randomly fail in this scenario. I wasn't able to figure out what exactly was causing these random failures. So, I removed the clean up, considering we would need the components on the test cluster anyway.

My guess is that the failures were due to the webhook components not being ready by the time keda is deployed. I have added some sleep in the setup section to try and resolve this issue.

You can try running the test again. I would want for the test to at least pass twice before we can begin trusting them.

@JorTurFer
Copy link
Member

JorTurFer commented May 9, 2022

/run-e2e azure*
Update: You can check the progres here

@JorTurFer
Copy link
Member

JorTurFer commented May 9, 2022

/run-e2e azure*
Update: You can check the progres here

@JorTurFer
Copy link
Member

JorTurFer commented May 10, 2022

/run-e2e azure*
Update: You can check the progres here

@JorTurFer
Copy link
Member

JorTurFer commented May 10, 2022

Personally, I don't have problems with keeping aad-wi there because there are e2e clusters and they are for that, but one question. Are both clusters ready (from Azure pov) or only this PR-e2e clusters is ready?
I mean, all needed resources/grants in Azure and their manage identities

Copy link
Member

@JorTurFer JorTurFer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!
Thanks for this huge effort

@v-shenoy
Copy link
Contributor Author

v-shenoy commented May 10, 2022

@JorTurFer I am fine with either, keeping the components in the cluster or cleaning them up. Right now, the tests seem to be working with the clean-up and sleep included, so I guess we can keep them. The reason this test run failed was that both the normal service bus queue and workload identity tests ran together at the same time.

If you want to be sure, you can run just the workload identity tests, regex - *workload-identity*.

As for the permissions / grants, they have been done for both the pr-e2e and main-build clusters. But obviously they have not been run on the other cluster yet, so after merging we will need to check that they work over there as well.

@JorTurFer
Copy link
Member

The reason this test run failed was that both the normal service bus queue and workload identity tests ran together at the same time.

wow, that's true and I didn't notice! Please use another queue for this test to avoid fails in case of running both at the same time (like it has happened). I mean, changing the queue name should be enough for avoiding the collisions during tests, right?

@JorTurFer
Copy link
Member

Right now, the tests seem to be working with the clean-up and sleep included, so I guess we can keep them

yes, if it's already the behavior, lets keep it

@v-shenoy
Copy link
Contributor Author

v-shenoy commented May 10, 2022

The reason this test run failed was that both the normal service bus queue and workload identity tests ran together at the same time.

wow, that's true and I didn't notice! Please use another queue for this test to avoid fails in case of running both at the same time (like it has happened). I mean, changing the queue name should be enough for avoiding the collisions during tests, right?

Yup, it just kind of slipped my mind that this would be a problem when the tests ran parallelly. Changed.

Could you run the tests? (Hopefully last time, fingers crossed).

@JorTurFer
Copy link
Member

JorTurFer commented May 10, 2022

/run-e2e azure-service-bus*
Update: You can check the progres here

@tomkerkhove
Copy link
Member

tomkerkhove commented May 10, 2022

/run-e2e azure*

Update: You can check the progres here

@v-shenoy
Copy link
Contributor Author

@JorTurFer @tomkerkhove Any more work to be done here?

@JorTurFer
Copy link
Member

hey @v-shenoy
Sorry for the slow answer, I have been busy with the hotfix release. It's all, thanks for the huge effort you have done

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants