Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Azure AD RBAC enable/disable with variable rbac_aad #269

Merged
merged 1 commit into from
Jan 13, 2023

Conversation

zioproto
Copy link
Collaborator

Fixes #215

The current code was using the variable var.role_based_access_control_enabled to check if creating the dynamic block azure_active_directory_role_based_access_control.

This is wrong because there is no way to use just Kubernetes RBAC without enabling the dynamic block azure_active_directory_role_based_access_control.

Copy link
Member

@lonegunmanb lonegunmanb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @zioproto for opening this pr, unfortunately we have some e2e test failures here.

│ Error: Operation failed
        	            	│ 
        	            	│   on ../../main.tf line 96, in resource "azurerm_kubernetes_cluster" "main":
        	            	│   96:     for_each = var.role_based_access_control_enabled && var.rbac_aad_azure_rbac_enabled && var.rbac_aad_managed ? ["rbac"] : []
        	            	│     ├────────────────
        	            	│     │ var.rbac_aad_azure_rbac_enabled is null
        	            	│     │ var.role_based_access_control_enabled is true
        	            	│ 
        	            	│ Error during operation: argument must not be null.

Would you please fix the error and let's try again? Thanks!

@github-actions
Copy link
Contributor

MAIN BRANCH PUSH DETECTED DUE TO #, THIS PR NEED TO BE UPDATED TO TRIGGER CI.

1 similar comment
@github-actions
Copy link
Contributor

github-actions bot commented Nov 1, 2022

MAIN BRANCH PUSH DETECTED DUE TO #, THIS PR NEED TO BE UPDATED TO TRIGGER CI.

@github-actions
Copy link
Contributor

github-actions bot commented Nov 1, 2022

MAIN BRANCH PUSH DETECTED DUE TO #272, THIS PR NEED TO BE UPDATED TO TRIGGER CI.

@github-actions
Copy link
Contributor

github-actions bot commented Nov 2, 2022

MAIN BRANCH PUSH DETECTED DUE TO #273, THIS PR NEED TO BE UPDATED TO TRIGGER CI.

@github-actions
Copy link
Contributor

MAIN BRANCH PUSH DETECTED DUE TO #, THIS PR NEED TO BE UPDATED TO TRIGGER CI.

@zioproto zioproto force-pushed the issue-215 branch 2 times, most recently from 77670a5 to 727bea2 Compare November 11, 2022 12:25
@github-actions
Copy link
Contributor

MAIN BRANCH PUSH DETECTED DUE TO #271, THIS PR NEED TO BE UPDATED TO TRIGGER CI.

@github-actions
Copy link
Contributor

MAIN BRANCH PUSH DETECTED DUE TO #, THIS PR NEED TO BE UPDATED TO TRIGGER CI.

@github-actions
Copy link
Contributor

MAIN BRANCH PUSH DETECTED DUE TO #275, THIS PR NEED TO BE UPDATED TO TRIGGER CI.

@github-actions
Copy link
Contributor

MAIN BRANCH PUSH DETECTED DUE TO #, THIS PR NEED TO BE UPDATED TO TRIGGER CI.

@zioproto zioproto force-pushed the issue-215 branch 2 times, most recently from 10eec14 to 04dd7cf Compare November 14, 2022 09:24
@github-actions
Copy link
Contributor

Potential Breaking Changes in 04dd7cf:
[update] "Variables.rbac_aad_azure_rbac_enabled.Default" from 'null' to 'false'
[update] "Variables.rbac_aad_azure_rbac_enabled.Nullable" from '' to 'false'

Copy link
Member

@lonegunmanb lonegunmanb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @zioproto, I have a question for your pr.

main.tf Outdated
@@ -95,7 +95,7 @@ resource "azurerm_kubernetes_cluster" "main" {
}
}
dynamic "azure_active_directory_role_based_access_control" {
for_each = var.role_based_access_control_enabled && var.rbac_aad_managed ? ["rbac"] : []
for_each = var.role_based_access_control_enabled && var.rbac_aad_azure_rbac_enabled && var.rbac_aad_managed ? ["rbac"] : []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only way that we can set an azure_active_directory_role_based_access_control block with var.rbac_aad_managed = true is the azure_active_directory_role_based_access_control.azure_rbac_enabled = true, we cannot set it to null or false, is that correct? Why the provider allows us to configure azure_rbac_enabled's value?

Copy link
Collaborator Author

@zioproto zioproto Nov 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct. I think this is confusing because of the naming used by Terraform.

If you try with the CLI to enable just --enable-azure-rbac you get the error:

--enable-azure-rbac can only be used together with --enable-aad

The provider let you configure azure_rbac_enabled value because this is a valid config:

role_based_access_control_enabled = true
rbac_aad_azure_rbac_enabled = false
rbac_aad_managed = true

https://learn.microsoft.com/en-us/rest/api/aks/managed-clusters/create-or-update?tabs=HTTP

It looks like this in the API:

    "aadProfile": {
      "managed": true,
      "enableAzureRBAC": false
    },
    "enableRBAC": true,

@lonegunmanb
Copy link
Member

Hi @zioproto we got a failed test when we tried to provison example/startup:

TestExamplesStartup 2022-11-16T01:00:17Z logger.go:66: module.aks.azurerm_kubernetes_cluster.main: Refreshing state... [id=/subscriptions/xxxx/resourceGroups/eb0a5e2abd042c6f-rg/providers/Microsoft.ContainerService/managedClusters/prefix-eb0a5e2abd042c6f-aks]
TestExamplesStartup 2022-11-16T01:00:19Z logger.go:66: ╷
TestExamplesStartup 2022-11-16T01:00:19Z logger.go:66: │ Warning: Argument is deprecated
TestExamplesStartup 2022-11-16T01:00:19Z logger.go:66: │ 
TestExamplesStartup 2022-11-16T01:00:19Z logger.go:66: │   with azurerm_subnet.test,
TestExamplesStartup 2022-11-16T01:00:19Z logger.go:66: │   on main.tf line 31, in resource "azurerm_subnet" "test":
TestExamplesStartup 2022-11-16T01:00:19Z logger.go:66:31:   enforce_private_link_endpoint_network_policies = true
TestExamplesStartup 2022-11-16T01:00:19Z logger.go:66: │ 
TestExamplesStartup 2022-11-16T01:00:19Z logger.go:66: │ `enforce_private_link_endpoint_network_policies` will be removed in favour
TestExamplesStartup 2022-11-16T01:00:19Z logger.go:66: │ of the property `private_endpoint_network_policies_enabled` in version 4.0
TestExamplesStartup 2022-11-16T01:00:19Z logger.go:66: │ of the AzureRM Provider
TestExamplesStartup 2022-11-16T01:00:19Z logger.go:66: ╵
TestExamplesStartup 2022-11-16T01:00:19Z logger.go:66: ╷
TestExamplesStartup 2022-11-16T01:00:19Z logger.go:66: │ Error: retrieving Access Profile for Managed Cluster (Subscription: "xxxx"
TestExamplesStartup 2022-11-16T01:00:19Z logger.go:66: │ Resource Group Name: "eb0a5e2abd042c6f-rg"
TestExamplesStartup 2022-11-16T01:00:19Z logger.go:66: │ Resource Name: "prefix-eb0a5e2abd042c6f-aks"): managedclusters.ManagedClustersClient#GetAccessProfile: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="BadRequest" Message="Getting static credential is not allowed because this cluster is set to disable local accounts."
TestExamplesStartup 2022-11-16T01:00:19Z logger.go:66: │ 
TestExamplesStartup 2022-11-16T01:00:19Z logger.go:66: │   with module.aks.azurerm_kubernetes_cluster.main,
TestExamplesStartup 2022-11-16T01:00:19Z logger.go:66: │   on ../../main.tf line 17, in resource "azurerm_kubernetes_cluster" "main":
TestExamplesStartup 2022-11-16T01:00:19Z logger.go:66: │   17: resource "azurerm_kubernetes_cluster" "main" {
TestExamplesStartup 2022-11-16T01:00:19Z logger.go:66: │ 
TestExamplesStartup 2022-11-16T01:00:19Z logger.go:66:

@zioproto
Copy link
Collaborator Author

The test fails because: local_account_disabled = true

local_account_disabled = true

the only change of previous behavior in this PR is here:

https://github.com/Azure/terraform-azurerm-aks/pull/269/files#diff-dc46acf24afd63ef8c556b77c126ccc6e578bc87e3aa09a931f33d9bf2532fbbL102

The value of azure_rbac_enabled is false instead of before it was null.

I understand that azure_rbac_enabled set to null has the consequence that local_account_disabled is ignored.

This is probably a different bug, but I have to do more research to see if it comes from the Terraform provider or from AKS itself.

@zioproto
Copy link
Collaborator Author

There was a bug in my conditional logic so that the block azure_active_directory_role_based_access_control was never created, and for this reason the end to end test was failing.

I pushed now a new commit that introduces a new variable, the rbac_aad, in addition to rbac_aad_managed.

There was a misconception on my side, where my initial understand was that rbac_aad_managed was used to enable/disable the Azure AD integration. This is wrong.

The existence of the block azure_active_directory_role_based_access_control determines if the Azure AD integration is active or not.

The managed = true or managed = false inside the block have the following meaning:

@lonegunmanb please review now

Extra info: When the block azure_active_directory_role_based_access_control is missing completely (that was never tested before) it seems is still possible to give the directive --disable-local-accounts. However this will change in AKS 1.25+

variables.tf Outdated
variable "rbac_aad" {
type = bool
description = "(Optional) Is Azure Active Directory ingration enabled?"
default = false
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont like the idea of enabling this by default, so I have set this to false.
But this is a breaking change, because this will disable the azure_active_directory_role_based_access_control block.

I am in doubt if true is a better default to avoid breaking changes.

Cc:
@lonegunmanb

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As agreed in our meeting, we will set the default value to true because it is the most secure value for the users.

@zioproto zioproto temporarily deployed to acctests November 17, 2022 00:51 Inactive
@github-actions
Copy link
Contributor

MAIN BRANCH PUSH DETECTED DUE TO #, THIS PR NEED TO BE UPDATED TO TRIGGER CI.

1 similar comment
@github-actions
Copy link
Contributor

MAIN BRANCH PUSH DETECTED DUE TO #, THIS PR NEED TO BE UPDATED TO TRIGGER CI.

@davidspek
Copy link
Contributor

Also related too #137

@zioproto zioproto temporarily deployed to acctests November 30, 2022 00:45 Inactive
Copy link
Member

@lonegunmanb lonegunmanb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @zioproto for this update, almost LGTM but I have one change request and one question.

locals.tf Outdated Show resolved Hide resolved
main.tf Outdated

content {
admin_group_object_ids = var.rbac_aad_admin_group_object_ids
azure_rbac_enabled = var.rbac_aad_azure_rbac_enabled
azure_rbac_enabled = local.rbac_aad_azure_rbac_enabled
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we declare an aks resource with an azure_active_directory_role_based_access_control block, but omit it's azure_rbac_enabled argument? I mean, if we set null to this argument, is that a valid configuration?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a valid configuration. In the master branch the variable rbac_aad_azure_rbac_enabled has already a default value of null. This is because azure_rbac_enabled is optional.

This is a valid configuration and it means:

  • Enable the Active Directory integration for authentication
  • Do not use Active Director for RBAC

This is the equivalent of doing az aks create --enable-aad without the parameter --enable-azure-rbac.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should avoid having the local and use directly the variable then ?

@zioproto
Copy link
Collaborator Author

@lonegunmanb could you please enable the E2E test for this PR again ? It think now it is good to test and merge. Thanks !

@zioproto zioproto changed the title Azure AD RBAC enable/disable with variable rbac_aad_azure_rbac_enabled Azure AD RBAC enable/disable with variable rbac_aad Jan 9, 2023
main.tf Outdated
@@ -256,6 +256,10 @@ resource "azurerm_kubernetes_cluster" "main" {
condition = local.automatic_channel_upgrade_check
error_message = "Either disable automatic upgrades, or only specify up to the minor version when using `automatic_channel_upgrade=patch` or don't specify `kubernetes_version` at all when using `automatic_channel_upgrade=stable|rapid|node-image`. With automatic upgrades `orchestrator_version` must be set to `null`."
}
precondition {
condition = var.role_based_access_control_enabled || (!var.role_based_access_control_enabled && !var.rbac_aad)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lonegunmanb could you please review if this precondition looks good to you ? thanks

This is the goal:

role_based_access_control_enabled rbac_aad condition
True True True
True False True
False True False
False False True

Copy link
Member

@lonegunmanb lonegunmanb Jan 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can simplify this condition to:

condition     = var.role_based_access_control_enabled || !var.rbac_aad

According to the error_message, we also can invert the statement to:

condition     = !(!var.role_based_access_control_enabled && var.rbac_aad)

This condition matches the error_message exactly.

Let's change this truth table a little to:

role_based_access_control_enabled !rbac_aad condition
True False True
True True True
False False False
False True True

So it's an or operation between role_based_access_control_enabled and !rbac_aad.

Copy link
Member

@lonegunmanb lonegunmanb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @zioproto, LGTM! 🚀

@lonegunmanb lonegunmanb merged commit f955123 into Azure:master Jan 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Can't use role_based_access_control without azure_active_directory
3 participants