Enable LeaderElect for federation controller #394

fisherxu · 2018-11-06T07:55:56Z

Enable LeaderElect for the federation controller so that it can be deployed as multiple instances.
/cc @marun @font @irfanurrehman @shashidharatd

k8s-ci-robot · 2018-11-06T07:56:02Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: fisherxu
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: irfanurrehman

If they are not already assigned, you can assign the PR to them by writing /assign @irfanurrehman in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

gyliu513 · 2018-11-06T09:33:52Z

charts/federation-v2/values.yaml

@@ -2,7 +2,7 @@
 # This is a YAML-formatted file.
 # Declare variables to be passed into your templates.

-## Configuration values for federation v2 controllermanager statefulset.
+## Configuration values for federation v2 controllermanager deployment.


Please check #245 (comment) for why we are using sts here.

/cc @marun

Thanks for the reminder, @gyliu513 :) I have checked the comment, and it applies in the absence of leader election. We need to deploy multiple instances with leader election, and for multiple controller instances a deployment should be fine, as with kube-controller-manager.

marun · 2018-11-06T17:14:48Z

I think this kind of feature deserves some discussion, maybe even a design document, before implementation is proposed.

Edit: specifically - I'd like to have the use cases this is intended to target clearly documented and agreed upon before work is merged.

fisherxu · 2018-11-07T09:21:00Z

I'd like to have the use cases this is intended to target clearly documented

@marun I have raised issue #402 to describe some use cases, and we can discuss them there :)
I have also split the deployment change into a separate PR, #404.

marun

I would like to see an e2e test of leader election that ensures the code added by this PR is exercised and won't rot. I'm happy to assist if you would find that useful. I think that will likely involve adding a minimal 3rd test run alongside managed and unmanaged to ensure test isolation.

marun · 2018-11-16T15:44:47Z

cmd/controller-manager/main.go

+		RetryPeriod:   retryPeriod,
+		Callbacks: leaderelection.LeaderCallbacks{
+			OnStartedLeading: run,
+			OnStoppedLeading: func() {


The leader election callback would appear to ensure that a master is started, but how will a master be stopped if it loses the leadership? It appears that a master once started would never be stopped since there is nothing that closes the stop channel.

When leadership starts, leaderelection.RunOrDie creates the stop channel, and it closes the channel when leadership is lost.

That suggests that the stop channel for the non-leader-elect path should be created inside the if !enableLeaderElection block, so that it's clear that it won't be used outside of that block.

@marun I have updated the PR; the stopChan is now passed through the ctx to the run func. I have checked it in my environment and it works fine :) PTAL.
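The pattern discussed in this thread, where the function started on gaining leadership receives a context that is cancelled when leadership is lost, can be sketched with the standard library alone. This is a simplified simulation and not client-go's leaderelection package: runWithLeadership, the lost channel, and the event strings are hypothetical stand-ins chosen for illustration.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// runWithLeadership is a hypothetical stand-in for an elector. It calls
// onStarted with a context that is cancelled once lost is closed, waits for
// onStarted to return, and then calls onStopped.
func runWithLeadership(lost <-chan struct{}, onStarted func(ctx context.Context), onStopped func()) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		onStarted(ctx)
	}()

	<-lost    // leadership has been lost
	cancel()  // ask the controllers to stop
	wg.Wait() // wait for them to finish
	onStopped()
}

// simulate runs one leadership term and returns the observed events in order.
func simulate() []string {
	var events []string
	lost := make(chan struct{})
	started := make(chan struct{})

	go func() {
		<-started   // wait until the controllers are running
		close(lost) // then simulate loss of the lease
	}()

	runWithLeadership(lost,
		func(ctx context.Context) {
			events = append(events, "controllers started")
			close(started)
			<-ctx.Done() // controllers block until asked to stop
			events = append(events, "controllers stopped")
		},
		func() { events = append(events, "leadership lost") },
	)
	return events
}

func main() {
	for _, e := range simulate() {
		fmt.Println(e)
	}
}
```

With this shape the non-leader-elect path can call the same run function with its own context, so no stop channel needs to live outside the block that uses it.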

pmorie · 2018-11-16T16:09:02Z

cmd/controller-manager/main.go

+		Callbacks: leaderelection.LeaderCallbacks{
+			OnStartedLeading: run,
+			OnStoppedLeading: func() {
+				glog.Fatalf("leaderelection lost")


On loss of leadership, it seems like it would be sufficient to send on the stop channel passed to run.
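One way to signal the stop channel is to close it instead of sending a value: a close wakes every goroutine receiving from the channel, while a single send wakes only one. A minimal standard-library sketch, assuming hypothetical startWorkers and stopAll helpers in place of the PR's actual run function:

```go
package main

import (
	"fmt"
	"sync"
)

// startWorkers starts n goroutines that each block until stop is closed.
// It returns a function that waits for all of them and reports how many
// observed the stop signal.
func startWorkers(n int, stop <-chan struct{}) func() int {
	var wg sync.WaitGroup
	var mu sync.Mutex
	stopped := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-stop // unblocks for every receiver once stop is closed
			mu.Lock()
			stopped++
			mu.Unlock()
		}()
	}
	return func() int {
		wg.Wait()
		return stopped
	}
}

// stopAll starts n workers, closes the stop channel once, and returns the
// number of workers that shut down.
func stopAll(n int) int {
	stop := make(chan struct{})
	wait := startWorkers(n, stop)
	close(stop)
	return wait()
}

func main() {
	fmt.Println("workers stopped:", stopAll(3))
}
```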

marun · 2018-12-17T22:38:56Z

@fisherxu Do you intend to follow up on this PR? It's been dormant for over a month.

fisherxu · 2018-12-19T02:44:14Z

@marun Sorry for the delay. I think this PR depends on PR #57932, which passes the stop channel to the run function.
That PR was merged in client-go v1.12.0, so the client-go we use now (v1.10.1) can't pass the stop channel.
Do we have any plan to upgrade client-go? :)

marun · 2018-12-20T04:45:36Z

@fisherxu Thank you for following up!

I'm ok with upgrading to the latest client-go if it's possible. I think you mentioned on slack that upgrading client-go may depend on kubebuilder upgrading?

fisherxu · 2018-12-20T13:39:26Z

@marun I have checked today.
The code (pkg/client/clientset/versioned/typed/GROUP/VERSION/fake) generated by kubebuilder (v1.0.4) is not compatible with client-go (kubernetes-1.13.0), but it is compatible with client-go (kubernetes-1.12.0).

controller-runtime (currently used only in e2e) is at v0.1.1, the version generated by kubebuilder (v1.0.4). controller-runtime v0.1.1 is not compatible with client-go (kubernetes-1.12.0), so can we upgrade it manually to v0.1.8 as in #530, even though it's generated by kubebuilder (v1.0.4)?

If we upgrade kubebuilder to 1.0.5 or 1.0.6, the project may see big changes... but kubebuilder (v1.0.4) only supports up to client-go (kubernetes-1.12.0).

marun · 2019-01-02T16:45:16Z

@fisherxu What changes to the project do you expect with kubebuilder > 1.0.4? fedv2 doesn't use any kb-generated controllers; we use it to bootstrap new types and maintain the generated deepcopy, clients/listers/informers, and crd yaml. @shashidharatd has agreed to work on removing the need for generated clients/listers/informers, so only the deepcopy and crd yaml will remain.

fisherxu · 2019-01-03T07:00:40Z

@marun @shashidharatd The calls to NewPatchSubresourceAction in pkg/client/clientset/versioned/typed/GROUP/VERSION/fake generated by kubebuilder (v1.0.4) (line for example) don't pass the PatchType parameter, but in the latest client-go (1.13+) this func requires it. So upgrading client-go to 1.13 will make CI fail. Any suggestions here? :)
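The incompatibility described here can be illustrated without importing client-go. The sketch below uses local stand-in types; PatchType, PatchAction, and newPatchSubresourceAction are simplified assumptions that mirror only the part of the signature under discussion (a required patch type ahead of the patch bytes), not the library's full API.

```go
package main

import "fmt"

// PatchType and PatchAction are local stand-ins, not the client-go types,
// so that the sketch stays self-contained.
type PatchType string

const MergePatchType PatchType = "application/merge-patch+json"

type PatchAction struct {
	Namespace, Name string
	PatchType       PatchType
	Patch           []byte
	Subresource     string
}

// newPatchSubresourceAction mimics the newer shape, in which the patch type
// is a required argument placed before the patch bytes. Only the first
// subresource is recorded, which is a simplification for this sketch.
func newPatchSubresourceAction(namespace, name string, pt PatchType, patch []byte, subresources ...string) PatchAction {
	a := PatchAction{Namespace: namespace, Name: name, PatchType: pt, Patch: patch}
	if len(subresources) > 0 {
		a.Subresource = subresources[0]
	}
	return a
}

func main() {
	// A caller generated against the older shape would omit the patch type:
	//   newPatchSubresourceAction("ns", "obj", patch, "status")
	// That call does not compile here, because []byte is not a PatchType.
	a := newPatchSubresourceAction("ns", "obj", MergePatchType, []byte(`{}`), "status")
	fmt.Println(a.PatchType, a.Subresource)
}
```

Because the mismatch is a compile-time error in generated code, it would need to be resolved by regenerating the fake clientset or by removing it, not by a change at the call sites in hand-written code.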

marun · 2019-01-03T17:09:38Z

@fisherxu Does that mean that running kubebuilder generate with a version of kubebuilder > 1.0.4 will fix the problem? If it's more complicated than that, it may be desirable to wait until the generated client is removed from the tree.

fisherxu · 2019-01-07T13:13:08Z

If it's more complicated than that, it may be desirable to wait until the generated client is removed from the tree.

@marun Agreed, it is better to wait until the generated client is removed from the tree.

marun · 2019-01-24T23:33:54Z

@fisherxu Regarding testing the changes in this PR, I suggest writing a managed (fixture is in-process and test-managed) e2e test in which 2 controller managers are started in-process (using etcd fixture), one is killed, and the test validates that the second controller is elected leader.
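The shape of such a test can be sketched with the standard library only. The example below is a simulation under stated assumptions: the lock type is a hypothetical in-memory stand-in for the shared election record, candidates release it explicitly when stopped (a real elector would rely on lease expiry), and no etcd fixture or controller manager is involved.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// lock is a hypothetical in-memory stand-in for the shared election record.
type lock struct {
	mu     sync.Mutex
	holder string
}

// tryAcquire makes id the holder if the lock is free and reports whether id
// holds the lock afterwards.
func (l *lock) tryAcquire(id string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.holder == "" {
		l.holder = id
	}
	return l.holder == id
}

// release frees the lock if id currently holds it.
func (l *lock) release(id string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.holder == id {
		l.holder = ""
	}
}

// candidate polls the lock until it becomes leader, reports its id on
// elected, and then holds the lock until ctx is cancelled.
func candidate(ctx context.Context, id string, l *lock, elected chan<- string) {
	for !l.tryAcquire(id) {
		select {
		case <-ctx.Done():
			return
		case <-time.After(time.Millisecond):
		}
	}
	elected <- id
	<-ctx.Done()
	l.release(id)
}

// failover starts two candidates, stops whichever is elected first, and
// returns the leaders in the order they were elected.
func failover() (first, second string) {
	l := &lock{}
	elected := make(chan string, 2)
	cancels := map[string]context.CancelFunc{}
	for _, id := range []string{"a", "b"} {
		ctx, cancel := context.WithCancel(context.Background())
		cancels[id] = cancel
		go candidate(ctx, id, l, elected)
	}
	first = <-elected
	cancels[first]() // "kill" the current leader
	second = <-elected
	for _, cancel := range cancels {
		cancel()
	}
	return first, second
}

func main() {
	first, second := failover()
	fmt.Printf("first leader %q, then %q\n", first, second)
}
```

The assertion a real e2e test would make is the same as here: after the first leader is stopped, a different instance reports that it has been elected.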

marun · 2019-01-24T23:36:12Z

Also, removal of the generated client is complicated by the apparent lack of support of the Watch method in controller-runtime's generic client. Do you know of an alternative way of watching without a generated client (without having to adopt the new and largely untested controller scheme of the newer versions of kubebuilder)? Or would it be possible to update the generated clients to be compatible?

shashidharatd · 2019-03-07T03:17:05Z

Hi @fisherxu, shall we close this PR in favour of #632?

fisherxu · 2019-03-07T03:19:42Z

Sure, let's close this PR in favour of #632.

shashidharatd · 2019-03-07T03:26:03Z

Thanks @fisherxu

k8s-ci-robot requested review from font, irfanurrehman, marun and shashidharatd November 6, 2018 07:55

k8s-ci-robot added the cncf-cla: yes label Nov 6, 2018

k8s-ci-robot added the size/L label Nov 6, 2018

gyliu513 reviewed Nov 6, 2018

fisherxu mentioned this pull request Nov 8, 2018

Switch statefulset to deployment for federation-v2 controller manager #404

Closed

marun suggested changes Nov 16, 2018

pmorie reviewed Nov 16, 2018

k8s-ci-robot added the size/XL label and removed the size/L label Nov 19, 2018

fisherxu mentioned this pull request Nov 19, 2018

Enable LeaderElect for federation controller #402

Closed

fisherxu mentioned this pull request Dec 20, 2018

Upgrade client go to kubernetes-1.12.4 #530

Merged

fisherxu added 2 commits January 7, 2019 21:08

enable leader select for controller

b6b7802

add vender for leader election

356bc5f

shashidharatd mentioned this pull request Mar 6, 2019

Add leader election to controller-manager #632

Merged

fisherxu closed this Mar 7, 2019
