This repository has been archived by the owner on Jan 11, 2023. It is now read-only.

Availability zone support for k8s #3453

Merged: 10 commits, Aug 30, 2018

Conversation

sozercan
Member

@sozercan sozercan commented Jul 9, 2018

What this PR does / why we need it:

  • Adds availability zone infrastructure support for agents
  • Adds singlePlacementGroup support for VMSS agent nodes

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #1919

Special notes for your reviewer:
Work in progress - do not merge

If applicable:

  • documentation
  • unit tests
  • tested backward compatibility (ie. deploy with previous version, upgrade with this branch)

Release note:

Add availability zone support for k8s agents

//cc @ritazh @kkmsft @khenidak @feiskyer

@feiskyer
Member

feiskyer commented Aug 1, 2018

We should wait a while for kubernetes/enhancements#586, so as to support zoned AzureDisks as well.

@aheumaier

@sozercan @khenidak Does it make sense to separate these two topics so that this feature can get through?

@feiskyer feiskyer mentioned this pull request Aug 16, 2018
@sozercan
Member Author

sozercan commented Aug 16, 2018

@aheumaier: @ritazh is working on this PR currently

@codecov

codecov bot commented Aug 16, 2018

Codecov Report

Merging #3453 into master will increase coverage by 0.04%.
The diff coverage is 57.44%.

@@            Coverage Diff             @@
##           master    #3453      +/-   ##
==========================================
+ Coverage   55.41%   55.46%   +0.04%     
==========================================
  Files         108      108              
  Lines       16102    16146      +44     
==========================================
+ Hits         8923     8955      +32     
- Misses       6416     6425       +9     
- Partials      763      766       +3

@acs-bot acs-bot added size/L and removed size/M labels Aug 17, 2018
@ritazh ritazh force-pushed the availabilityZone branch 3 times, most recently from 9cdb28e to b95ef71 Compare August 21, 2018 23:55
@ritazh ritazh changed the title [WIP] Availability zone support for k8s Availability zone support for k8s Aug 22, 2018
if a.AgentPoolProfiles[i].Count < len(a.AgentPoolProfiles[i].AvailabilityZones)*2 {
return errors.New("The node count and the number of availability zones provided can result in zone imbalance. To achieve zone balance, each zone should have at least 2 nodes or more. ")
}
}
Member Author

Can we add validation that singlePlacementGroup is allowed for VMSS only?

Member

done
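
For readers following the thread, the two checks discussed above (at least 2 nodes per zone, and singlePlacementGroup restricted to VMSS pools) can be sketched as a standalone function. This is a minimal illustration, not the merged code: the `agentPool` type, the `AvailabilityProfile` field, the `vmss` constant, `validatePool`, and the error wording are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// vmss is the availability profile name assumed here for scale-set pools.
const vmss = "VirtualMachineScaleSets"

// agentPool is a hypothetical stand-in for the agent pool profile; only the
// fields needed by the two checks are modeled.
type agentPool struct {
	Name                 string
	Count                int
	AvailabilityProfile  string
	AvailabilityZones    []string
	SinglePlacementGroup *bool
}

// validatePool applies the zone-balance rule from the diff (count must be at
// least twice the number of zones) and rejects singlePlacementGroup on pools
// that are not scale sets.
func validatePool(p agentPool) error {
	if zones := len(p.AvailabilityZones); zones > 0 && p.Count < zones*2 {
		return errors.New("node count too low for zone balance: each zone needs at least 2 nodes")
	}
	if p.SinglePlacementGroup != nil && p.AvailabilityProfile != vmss {
		return errors.New("singlePlacementGroup is only supported for " + vmss)
	}
	return nil
}

func main() {
	spg := true
	pools := []agentPool{
		{Name: "balanced", Count: 4, AvailabilityProfile: vmss, AvailabilityZones: []string{"1", "2"}},
		{Name: "imbalanced", Count: 3, AvailabilityProfile: vmss, AvailabilityZones: []string{"1", "2"}},
		{Name: "notvmss", Count: 2, AvailabilityProfile: "AvailabilitySet", SinglePlacementGroup: &spg},
	}
	for _, p := range pools {
		fmt.Printf("%s: %v\n", p.Name, validatePool(p))
	}
}
```

Treating an unset pointer as "not specified" lets the validation distinguish a user who asked for singlePlacementGroup from one who left it to the default.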

@@ -38,6 +41,11 @@
"name": "[variables('{{.Name}}VMSize')]"
},
"properties": {
{{if UseSinglePlacementGroup .}}
"singlePlacementGroup": true,
Contributor

This could be:
"singlePlacementGroup": {{UseSinglePlacementGroup .}} ,
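
The suggestion relies on Go's text/template printing a bool result as `true` or `false`, so the conditional block is not needed. A minimal sketch of that behavior follows; the `pool` type and `render` helper are hypothetical and only the template line is taken from the comment above.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// pool is a hypothetical stand-in for the agent pool profile.
type pool struct {
	SinglePlacementGroup bool
}

// render executes a one-line template in which the helper's bool result is
// printed directly, instead of being wrapped in an {{if}} block.
func render(p pool) (string, error) {
	funcs := template.FuncMap{
		"UseSinglePlacementGroup": func(p pool) bool { return p.SinglePlacementGroup },
	}
	tmpl, err := template.New("vmss").Funcs(funcs).Parse(`"singlePlacementGroup": {{UseSinglePlacementGroup .}},`)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, p); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	for _, v := range []bool{true, false} {
		out, err := render(pool{SinglePlacementGroup: v})
		if err != nil {
			panic(err)
		}
		fmt.Println(out)
	}
}
```

One difference worth noting: the direct form always emits the property, including when the value is false, whereas the `{{if}}` form omits it in that case.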

@@ -38,6 +41,11 @@
"name": "[variables('{{.Name}}VMSize')]"
},
"properties": {
{{if UseSinglePlacementGroup .}}
"singlePlacementGroup": true,
Contributor

ditto

@@ -555,6 +556,33 @@ func setMasterNetworkDefaults(a *api.Properties, isUpgrade bool) {
}
}

// setVMSSDefaults
// singlePlacementGroup = false, the scale set can be composed of multiple placement groups and has a range of 0-1,000 VMs
// singlePlacementGroup = true,, default value, a scale set is composed of a single placement group, and has a range of 0-100 VMs
Contributor

extra , here

// singlePlacementGroup = true,, default value, a scale set is composed of a single placement group, and has a range of 0-100 VMs
// Large scale sets require Azure Managed Disks.
// https://docs.microsoft.com/en-us/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-placement-groups
// For availability zones, only standard load balancer is supported.
Contributor

these comments should be added to the docs instead (https://github.com/Azure/acs-engine/blob/master/docs/clusterdefinition.md)

Contributor

@CecileRobertMichon CecileRobertMichon left a comment

/lgtm

@acs-bot

acs-bot commented Aug 29, 2018

New changes are detected. LGTM label has been removed.

@ritazh
Member

ritazh commented Aug 30, 2018

rebased, jenkins e2e passed. ready for another round of review

a.OrchestratorProfile.KubernetesConfig.LoadBalancerSku = "Standard"
a.OrchestratorProfile.KubernetesConfig.ExcludeMasterFromStandardLB = helpers.PointerToBool(api.DefaultExcludeMasterFromStandardLB)
}
if profile.SinglePlacementGroup == nil {
Member

This should move up above line 570, so that if we ever change DefaultSinglePlacementGroup to false, the "if false, set managed disks" branch still takes effect.
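
The ordering concern can be illustrated with a small sketch: resolve the nil default first, then branch on the resolved value. This is one reading of the comment, not the merged code; `agentPool`, `StorageProfile`, `boolPtr`, and the constant `defaultSinglePlacementGroup` (assumed to mirror `DefaultSinglePlacementGroup`) are hypothetical names for the example.

```go
package main

import "fmt"

// defaultSinglePlacementGroup is assumed to mirror the API default.
const defaultSinglePlacementGroup = true

// agentPool is a hypothetical stand-in for the agent pool profile.
type agentPool struct {
	SinglePlacementGroup *bool
	StorageProfile       string
}

func boolPtr(b bool) *bool { return &b }

// setVMSSDefaults fills in SinglePlacementGroup before anything reads it, so
// the managed-disk branch sees the resolved value even if the default
// constant is later changed to false.
func setVMSSDefaults(p *agentPool) {
	if p.SinglePlacementGroup == nil {
		p.SinglePlacementGroup = boolPtr(defaultSinglePlacementGroup)
	}
	// Large scale sets (multiple placement groups) require managed disks.
	if !*p.SinglePlacementGroup {
		p.StorageProfile = "ManagedDisks"
	}
}

func main() {
	unset := &agentPool{StorageProfile: "StorageAccount"}
	large := &agentPool{SinglePlacementGroup: boolPtr(false), StorageProfile: "StorageAccount"}
	setVMSSDefaults(unset)
	setVMSSDefaults(large)
	fmt.Println(*unset.SinglePlacementGroup, unset.StorageProfile)
	fmt.Println(*large.SinglePlacementGroup, large.StorageProfile)
}
```

If the nil check ran after the managed-disk branch instead, dereferencing the pointer would need its own nil guard, and a false default would never reach that branch.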

return *profile.SinglePlacementGroup
},
"HasAvailabilityZones": func(profile *api.AgentPoolProfile) bool {
return profile.AvailabilityZones != nil && len(profile.AvailabilityZones) > 0
Member

Let's create a convenience function hasAvailabilityZones that this template func calls...

@@ -173,6 +173,9 @@ func getParameters(cs *api.ContainerService, generatorCode string, acsengineVers
for _, agentProfile := range properties.AgentPoolProfiles {
addValue(parametersMap, fmt.Sprintf("%sCount", agentProfile.Name), agentProfile.Count)
addValue(parametersMap, fmt.Sprintf("%sVMSize", agentProfile.Name), agentProfile.VMSize)
if agentProfile.AvailabilityZones != nil && len(agentProfile.AvailabilityZones) > 0 {
Member

...and then we just call hasAvailabilityZones from here

(sorry, the first part of the ellipsis is below ;) )

Basically, that way HasAvailabilityZones in the params template directly correlates to the criteria here. We always want to define the param, and then inject it using this addValue function, according to the exact same criteria.
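
A sketch of the refactor requested here: one predicate that both the template func and the parameter builder call, so the two can never disagree. The `agentPool` type, `templateFuncs`, `parameters`, and the `AvailabilityZones` parameter suffix are assumptions for illustration only.

```go
package main

import "fmt"

// agentPool is a hypothetical stand-in for the agent pool profile.
type agentPool struct {
	Name              string
	Count             int
	AvailabilityZones []string
}

// HasAvailabilityZones is the single source of truth for the criteria.
// len of a nil slice is 0 in Go, so no separate nil check is needed.
func (p *agentPool) HasAvailabilityZones() bool {
	return len(p.AvailabilityZones) > 0
}

// templateFuncs exposes the predicate to templates by delegating to the method.
func templateFuncs() map[string]interface{} {
	return map[string]interface{}{
		"HasAvailabilityZones": func(p *agentPool) bool { return p.HasAvailabilityZones() },
	}
}

// parameters injects the zones value only when the same predicate holds.
func parameters(pools []*agentPool) map[string]interface{} {
	params := map[string]interface{}{}
	for _, p := range pools {
		params[p.Name+"Count"] = p.Count
		if p.HasAvailabilityZones() {
			params[p.Name+"AvailabilityZones"] = p.AvailabilityZones
		}
	}
	return params
}

func main() {
	pools := []*agentPool{
		{Name: "zoned", Count: 4, AvailabilityZones: []string{"1", "2"}},
		{Name: "plain", Count: 3},
	}
	params := parameters(pools)
	_, zonedOK := params["zonedAvailabilityZones"]
	_, plainOK := params["plainAvailabilityZones"]
	fmt.Println(zonedOK, plainOK)
}
```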

@jackfrancis
Member

@ritazh Looks great, added some final comments. Thanks so much!

}

if sv.LT(minVersion) {
return errors.Errorf("availabilityZone is only available in Kubernetes version 1.12 or greater.")
Contributor

nit: This should be errors.New
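
The nit is that a message with no format arguments does not need a formatting constructor. A minimal sketch using only the standard library follows; the diff itself compares parsed semantic versions with `sv.LT(minVersion)`, whereas the `majorMinor` parser, `validateZoneVersion`, and `errZonesUnsupported` below are hypothetical simplifications for the example.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// errZonesUnsupported has a fixed message, so errors.New is sufficient.
var errZonesUnsupported = errors.New("availabilityZone is only available in Kubernetes version 1.12 or greater")

// majorMinor extracts the first two numeric components of a version string.
// Here the message does carry arguments, so a formatting constructor fits.
func majorMinor(v string) (int, int, error) {
	parts := strings.SplitN(v, ".", 3)
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("invalid version %q", v)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return 0, 0, fmt.Errorf("invalid major version in %q: %v", v, err)
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return 0, 0, fmt.Errorf("invalid minor version in %q: %v", v, err)
	}
	return major, minor, nil
}

// validateZoneVersion rejects versions below 1.12.
func validateZoneVersion(v string) error {
	major, minor, err := majorMinor(v)
	if err != nil {
		return err
	}
	if major < 1 || (major == 1 && minor < 12) {
		return errZonesUnsupported
	}
	return nil
}

func main() {
	for _, v := range []string{"1.11.2", "1.12.0"} {
		fmt.Printf("%s: %v\n", v, validateZoneVersion(v))
	}
}
```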

@jackfrancis
Member

/lgtm thanks @ritazh and @sozercan !!!

@acs-bot

acs-bot commented Aug 30, 2018

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: CecileRobertMichon, jackfrancis, sozercan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [CecileRobertMichon,jackfrancis]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jackfrancis jackfrancis merged commit 9897327 into Azure:master Aug 30, 2018
@ghost ghost removed the in progress label Aug 30, 2018
@sozercan sozercan deleted the availabilityZone branch August 30, 2018 20:48

Successfully merging this pull request may close these issues.

Multizone support?
8 participants