Skip to content
This repository has been archived by the owner on Sep 30, 2020. It is now read-only.

Allow more control over ASG definition #142

Merged
merged 4 commits into from
Dec 14, 2016

Conversation

cknowles
Copy link
Contributor

@cknowles cknowles commented Dec 8, 2016

Min/max size and min instances in service can now be directly defined.
Defaults taken from the older controllerCount and workerCount

Min/max size and min instances in service can now be directly defined.
Defaults take from the older `controllerCount` and `workerCount`
@codecov-io
Copy link

codecov-io commented Dec 8, 2016

Current coverage is 69.68% (diff: 100%)

Merging #142 into master will decrease coverage by 0.22%

@@             master       #142   diff @@
==========================================
  Files             5          4     -1   
  Lines          1110       1095    -15   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
- Hits            776        763    -13   
+ Misses          249        248     -1   
+ Partials         85         84     -1   

Powered by Codecov. Last update 56f08d5...a3d586a

@@ -9,3 +9,6 @@
coverage.txt
profile.out
/.glide

# files generated during tests
config/temp
Copy link
Contributor

@mumoshu mumoshu Dec 9, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!
But I've fixed test helper in 194ff10#diff-490c12c05aa246994472cb51aa5c358bR22 so if you've rebased to master, tests don't leave garbages like config/temp, nodepool/config/temp anymore.
Would you mind undoing this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind removing but one thing is if viewing git diff during a test run, it shows many uncommitted files before this. I think that could be a good reason to leave it in. I'm using some Atom extensions that mean even just saving will run the tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebase to the latest master and rm -Rf config/temp nodepool/config/temp and these uncommitted files won't appear again 💮
If it appears, it is a bug. I'd rather like to notice bugs instead of ignoring them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if anything was changed on latest master that would help this as master hasn't changed since this comment? Anyway, it still happens off latest - my local save runs the tests and if I happen to go to git diff while that is running it shows many files in config/temp. Not a big problem, but it is still there.

Copy link
Contributor

@mumoshu mumoshu Dec 13, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the info!
That's odd 😭
Would you mind reverting this and then raising an another issue specific to that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}
if config.WorkerASG.RollingUpdateMinInstancesInService == 0 {
config.WorkerASG.RollingUpdateMinInstancesInService = config.WorkerASG.MaxSize - 1
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we go this way, I guess we also need to set defaults for WorkerASG in nodepool/config/config.go or we'll end up with all of MinSize, MaxSize, RollingUPdateMinInstancesInService to be kept as is - 0 for node pools, hence stack creation error.

Copy link
Contributor

@mumoshu mumoshu Dec 9, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I guess we can do this more immutably(=no need to mutate WorkerASG.RollingUpdateminInstancesInService) and DRY(=no need to copy-paste this to nodepool/config/config.go) by wrapping RollingUpdateMinInstancesInService field into a func named the same.
Wrote a snippet to show my thoughts: https://play.golang.org/p/c_0Pu_UDMf

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As yes you are right, I had thought nodepools config was inherited from the main one. I want to share it somehow as you say so will look into it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only static, default values set in NewDefaultCluster() are inherited to node pool config.
IMHO currently there's no timing to inherit "computed" values like these.
That is one of reasons I believe it is better to implement "computed" values by wrapping them into functions, instead of mutating values in place. ref https://github.com/coreos/kube-aws/pull/142/files#r91641540

// setup worker ASG size, one is the default WorkerCount
if config.WorkerCount != 1 && (config.WorkerASG.MinSize != 0 || config.WorkerASG.MaxSize != 0) {
return nil, fmt.Errorf("`workerASG.MinSize` and `workerASG.MaxSize` can only be specified without `workerCount`")
}
Copy link
Contributor

@mumoshu mumoshu Dec 9, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A validation like this should be done in func (c WorkerSettings) Valid()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see. I missed that, I'm sure as you are aware it is a little hard to know about the style with old+new code together and such a big file! I will fix it.

}
if config.WorkerASG.MinSize == 0 {
config.WorkerASG.MinSize = config.WorkerCount
}
Copy link
Contributor

@mumoshu mumoshu Dec 9, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we just alter func (c WorkerSettings) MinWorkerCount() int to achieve the same end result?

e.g.

func (c WorkerSettings) MinWorkerCount() int {
	if c.WorkerASG.MinSize == 0 {
		return c.WorkerCount
	}
	return c.WorkerASG.MinSize
	// Instead of:
	// return c.WorkerCount - 1
}

So that we don't even need to modify stack-template.json and could keep backward-compatibility of what are (exposed to|usable from) stack-template.json, while no need to copy-pasting some code from config/config.to to nodepool/config/config.go like I've suggested in my prev comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would avoid keeping MinWorkerCount for two reasons:

  1. I think it's better if the layout of the names in config cluster.yaml match those stack-template.json
  2. Regarding backwards compatibility, MinWorkerCount was only introduced in the new kube-aws repo so I think the impact is minimal for anyone with major modifications as they would have to do similar to I have done (switch over and change quite a lot). I don't think we should maintain too many options/functions that do the same thing.

What are you thoughts assuming I do the function/DRY work above?

(Side note, this relates to me wanting to remove the old workerCount too as it's trivial for the user to update to min/max compared to keeping all so many different options available)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I basically agree to both 1. and 2. in your comment. I also agree to remove workerCount.
However I'd like them to be deprecated first and then dropped, in the future, not in this pull request. Adding/improving something and dropping existing feature/setting/option should be done in separate pull requests IMHO.

To be extra clear, I'm not satisfied with neither MinWorkerCount nor MaxWorkerCount as of today but that doesn't mean I want to drop them immediately, hence I'd like to leave MinWorkerCount, MaxWorkerCount and workerCount "for now".

}
if config.WorkerASG.MaxSize == 0 {
config.WorkerASG.MaxSize = util.IntMax(config.WorkerCount, config.WorkerASG.MinSize)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same as https://github.com/coreos/kube-aws/pull/142/files#r91637776 for func (c WorkerSettings) MaxWorkerCount() int

config.WorkerASG.MinSize = config.WorkerCount
}
if config.WorkerASG.MaxSize == 0 {
config.WorkerASG.MaxSize = util.IntMax(config.WorkerCount, config.WorkerASG.MinSize)
Copy link
Contributor

@mumoshu mumoshu Dec 9, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we'd like to add a validation like this to func (c WorkerSettings) Valid() error rather than silently fixing invalid configuration (i.e. MinSize > WorkerCount).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can do. What do you want the expected behaviour to be for the setup in TestASGsMinConfigured?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A validation error when MinSize > WorkerCount would be nice

@@ -17,19 +17,19 @@
"{{$subnet.AvailabilityZone}}"
{{end}}
],
"DesiredCapacity": "{{.WorkerCount}}",
"DesiredCapacity": "{{.WorkerASG.MinSize}}",
Copy link
Contributor

@mumoshu mumoshu Dec 9, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably you've not intended this?
I guess we should leave this {{.WorkerCount}} or we become unable set desired capacity

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My assumption is desired capacity is usually min size for cluster rollout? Afterwards AWS changes the desired capacity when we implement auto scaling. We could allow this to be customised as well? I can't remember the AWS behaviour if desired capacity is outside min/max etc. Right now I just replaced like for like so we could leave the extra config for later, i.e. it was the min count before and it is still the min count.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. However if that's your intention it should be achieved in an another way.

DesiredCapacity is not a required parameter.
If you want to leave your initial ASG size as same as MinSize, you could just omit "DesiredCapacity": "{{.WorkerASG.MinSize}}", so stack-template would look like:

{{if gt .WorkerCount 0}}"DesiredCapacity": "{{.WorkerCount}}",{{end}}

With this, If workerCount is omitted from cluster.yaml, DesiredCapacity is automatically set to MinSize by ASG itself and we're even unable to update DesiredCapacity via cfn stack updates. Thus, now, the user is responsible to update DesiredCapacity outside of kube-aws or CloudFormation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intention of this change is that the new WorkerASG.MinSize has the same value as the older WorkerCount. So if we want to omit it for that case, I would just omit it altogether and that would clarify that we currently do not support configuration of DesiredCapacity. What do you think?

It depends whether you want the DesiredCapacity to be set to the exact value it is prior to this change which it is MinSize is always set, defaulting to WorkerCount. In turn, this also keeps stack-template.json clean as it only uses the ASG properties rather than having multiple levels of defaulting/switching inside code and templates.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to give users option to choose either they modify DesiredCapacity inside or outside of CloudFormation plus I'd like not to break the existing use-case that set DesiredCapacity via workerCount.
We could add support for MinSize and MaxSize without removing such option or breaking existing use-case hence #142 (comment)

Copy link
Contributor

@mumoshu mumoshu Dec 9, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick thought but do you prefer entirely removing DesiredCapacity from stack-template.json in this pr and later(in an another issue/pr) improve kube-aws update command to support updating DesiredCapacity outside of cfn like e.g. kube-aws update --num-workers 10?

Doing this allows users to choose either updating desired capacity via kube-aws or not, anyways.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think so. Since it's currently set to the same size as min and AWS will default to that anyway, it's not currently doing much.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK so let's remove DesiredCapacity then 👍

Btw, I've noticed that omitting DesiredCapacity here would make the situation easier for #148 too.

@@ -99,19 +99,19 @@
"{{$subnet.AvailabilityZone}}"
{{end}}
],
"DesiredCapacity": "{{.ControllerCount}}",
"DesiredCapacity": "{{.ControllerASG.MinSize}}",
Copy link
Contributor

@mumoshu mumoshu Dec 9, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# workerASG:
# minSize: 1
# maxSize: 3
# rollingUpdateMinInstancesInService: 2
Copy link
Contributor

@mumoshu mumoshu Dec 9, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we adapt these to the new worker: key I've recently added for spot fleets so that we can have more consistency/distinction between 2 deployment strategy: auto scaling group and spot fleet?

#worker:
#  # Auto Scaling Group definition for workers. If only `workerCount` is specified, min and max will be the set to that value and `rollingUpdateMinInstancesInService` will be one less.
#  autoScalingGroup:
#    minSize: 1
#    maxSize: 3
#      rollingUpdateMinInstancesInservice: 2
#  # Spot Fleet definitions for workers
#  # You may specify autoScalingGroup or spotFleet, but not both.
#  # Spot Fleet support is still experimental. It can be changed in backward-incompatible ways
#  spotFleet:
#  *snip*

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we can! I saw that and was wondering if it was the way you wanted to go, I can update it. One thing will be we have old settings outside worker and new settings inside. My opinion is we need to deprecate some things in order to ensure a consistent and easy to understand tool.

Copy link
Contributor Author

@cknowles cknowles Dec 9, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you also want the controller ASG in a controller YAML property? Do you only mean nodepools or both cluster.yaml? We want to keep it consistent right (as it's all read from the same place)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes and yes! I'd like to have a new controller property and both worker and controller to be consistently used in both main cluster's and node pools' cluster.yaml

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entire sure how I can get this in without moving worker.go since otherwise there will be a duplicate Worker inside Cluster. Are you ok for me to move it? I saw there was a bunch of placeholder code anyway in config.go so it's pretty messy as it is.

@mumoshu
Copy link
Contributor

mumoshu commented Dec 9, 2016

@c-knowles Thanks for the PR!

Its functionality seems good 👍 however some comments on backward-compatibility, consistency and DRY.
Sorry for not clearly documenting these points anywhere but I believe this is also a good chance for us to share kube-aws' design.
Probably too many comments on a rather simple enhancement functionality-wise but could you take some time to address these?

@cknowles
Copy link
Contributor Author

cknowles commented Dec 9, 2016

@mumoshu great, thanks. I'll start on a few of those changes and I put some follow up comments above.

@mumoshu
Copy link
Contributor

mumoshu commented Dec 9, 2016

I've mistakenly commented via my another github account several times 😅 Never mind about that.

@mumoshu
Copy link
Contributor

mumoshu commented Dec 9, 2016

@c-knowles I've replied to every comment 👍

- Start model package
- Fix up nodepool ASGs as they only inherit static default values
- Move validation into Valid() methods
- Move ASG config into `worker` and `controller`
- Maintain backwards compatibility for now on
MinWorkerCount/MaxWorkerCount
- Increase validation for setting count along with min/max
- Remove desired capacity, it currently do not do much
@cknowles
Copy link
Contributor Author

@mumoshu I believe I've done most of what you requested above, I've started to align worker/controller as well as I was forced to combine those to remove duplicate definitions inside config/nodepools. I'd like to move more of the parsing and validation of the model into that package but I kept it to a minimum for now.

@@ -65,7 +64,7 @@
{{if .Experimental.WaitSignal.Enabled}}
"CreationPolicy" : {
"ResourceSignal" : {
"Count" : "{{.WorkerCount}}",
"Count" : "{{.MinWorkerCount}}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In kube-aws update, this should match the desired capacity at that time rather than min size, or I guess we will end up destroying the former autoscaling group too early the overall cluster capacity temporarily decreased until the rest of worker nodes(DesiredCapacity - MinSize) becomes ready.

So, do you mind that I'll follow up with a pr addressing this after merging?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, if that's appropriate. Since this is the CreationPolicy I thought it didn't apply to stack updates at all?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing that out! You're certainly correct. I've mistakenly mixed things up from the past when we've tried to do node replacements via "create asg and then destroy old one" instead of the current "gradually replace nodes in an asg".

(it's still needed just there is some want to commit it separately)
@mumoshu
Copy link
Contributor

mumoshu commented Dec 14, 2016

@c-knowles Thanks for your continuous efforts 🙇
Before merging this, could you remove util/math.go because it is not referenced from anywhere now?

@cknowles
Copy link
Contributor Author

Done. By the way, I think this resolves #158 as well.

@iameli
Copy link
Contributor

iameli commented Dec 14, 2016

Oh nice, that seems true. Thanks!

@mumoshu mumoshu merged commit 0cc915e into kubernetes-retired:master Dec 14, 2016
@mumoshu
Copy link
Contributor

mumoshu commented Dec 14, 2016

@c-knowles Merged. Thanks for the PR and a lot of follow ups 👍

@cknowles cknowles deleted the feature/asg-min-max branch December 14, 2016 03:39
@mumoshu mumoshu added this to the v0.9.2-rc.5 milestone Dec 14, 2016
kylehodgetts pushed a commit to HotelsDotCom/kube-aws that referenced this pull request Mar 27, 2018
* Allow more control over ASG definition

Min/max size and min instances in service can now be directly defined.
Defaults take from the older `controllerCount` and `workerCount`

* Start to unify worker/controller configuration

- Start model package
- Fix up nodepool ASGs as they only inherit static default values
- Move validation into Valid() methods
- Move ASG config into `worker` and `controller`
- Maintain backwards compatibility for now on
MinWorkerCount/MaxWorkerCount
- Increase validation for setting count along with min/max
- Remove desired capacity, it currently do not do much
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants