Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove ClusterClaim and update ClusterSet version #5001

Merged
merged 1 commit into from
Jul 11, 2023

Conversation

luolanzone
Copy link
Contributor

@luolanzone luolanzone commented May 18, 2023

  1. Remove ClusterClaim CRD from manifests
  2. Update ClusterSet version from v1alpha1 to v1alpha2, and add a new field ClusterID
    in ClusterSet CRD to define a cluster's ID, new ClusterSet CR's name must match ClusterSetID.
  3. Update Multi-cluster docs.

@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

@luolanzone luolanzone added the area/multi-cluster Issues or PRs related to multi cluster. label May 19, 2023
@luolanzone luolanzone added this to the Antrea v1.13 release milestone May 22, 2023
@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

@luolanzone luolanzone changed the title [WIP]Remove ClusterClaim and update ClusterSet spec Remove ClusterClaim and update ClusterSet spec May 23, 2023
@luolanzone luolanzone added the action/release-note Indicates a PR that should be included in release notes. label May 25, 2023
@luolanzone luolanzone force-pushed the simplify-clusterset branch 2 times, most recently from db0af43 to ef8bcaf Compare May 31, 2023 08:03
@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

@luolanzone luolanzone force-pushed the simplify-clusterset branch 2 times, most recently from 84a8c2c to b551bd5 Compare June 13, 2023 06:30
@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

// DEPRECATED
// This field is not used for ClusterSet setup and planned to be removed
// in the future releases.
Members []LeaderClusterInfo `json:"members,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we change API version, should we just remove it? Do you know what will happen with existing old version objects if we remove this field?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new CRD should have no impact to old version objects. This should be removed safely in new version, I will refine and double check.


#### Set up Access to Leader Cluster

We first need to set up access to the leader cluster's API server for all member
clusters. We recommend creating one ServiceAccount for each member for
fine-grained access control.

The Multi-cluster Controller deployment manifest for a leader cluster also creates
a default member cluster toke. If you prefer to use the default token, you can skip
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

toke -> token

ClusterSet ID, and a new `clusterID` field specifies the Cluster ID.

To upgrade from a version older than v1.13, once you apply the latest manifests for an
existing ClusterSet, you need to recreate the `ClusterSet` CR manually with `clusterID`, and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not get what you mean. You said "The old CRs will take no affect." After upgrade, mc-controller will not recognize old version CRs? Then the previous ClusterSet is lost? Then what will happen for MC Gateway, ResourceExport/Import, etc.? How about ongoing MC traffic?

Copy link
Contributor Author

@luolanzone luolanzone Jun 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Existing traffic shouldn't be impacted if there is no Gateway, Endpoint changes etc. Any new changes for exported Services or Gateways or Pods will be ignored during upgrade. Previous ClusterSet won't be processed since they are different versions unless we add a ClusterSet conversion webhook to update it to a new version's CR.
If we add a ClusterSet conversion, we need get the ClusterID from a ClusterClaim, I feel it don't make sense to do a version upgrade based on two CRs. So I feel we may alert users that this is a breaking change and need to add new ClusterSet manually, what's your thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After the new MC Controller starts, nothing will happen if it does not see a valid ClusterSet CR?

And what will happen if there is any change in the ClusterSet (in local or remote or leader cluster) before the new ClusterSet CR is created? E.g. Service Endpoints change, a MC Service or Gateway is deleted / created, a new Pod is created that affects label identities, ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if we didn't provide a ClusterSet conversion webhook, no remote common area will be created in antrea-mc-controller since MC controller won't recognize the old ClusterSet CR. Any change event reconcile will fail until a valid ClusterSet CR is created. The failed events will be re-queued and retry until a valid remote common area is available.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing will happen for existing ResourceExport, ResourceImport, ServiceExport, etc.? For example, leader wont remove any Export/Import for a member? If that is the case, does it mean nothing will happen for existing Export/Import either, even we remove a ClusterSet CR from leader or member? Then how can users cleanup a ClusterSet? They need to manually delete all Export/Import CRs in leader and members? That does not sound like a good experience.

For upgrade, I wonder why not create a new version ClusterSet CR, before updating MC Controller deployment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jianjuns yeah, I am afraid user needs to clean existing resources by themself at the moment, I think we discussed resources clean up before when a member cluster leave a ClusterSet, but there was no conclusion and was postponed. I feel we may prioritize this part in next release.
For upgrade, user can create a new version of CR before updating deployment, but I feel we may need to provide a way to split CRDs update and deployment update first. For now, when user is trying to run kubectl apply with new manifests, both CRDs and MC controller deployment will be updated almost at the same time.
I will check if there is a proper way to split them, let me know if you have other ideas. Thanks.

Copy link
Contributor Author

@luolanzone luolanzone Jun 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After run some verification locally, I think we need keep both versions for ClusterSet if we want to support upgrade for existing ClusterSet since the CR of version v1alpha1 is still kept in the cluster.
I feel we may still need a ClusterSet conversion controller based on existing ClusterClaims and ClusterSet. Maybe not a webhook, just a controller like stale_controller to watch two CRs and create a new version of ClusterSet if possible.
what's your suggestion?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For cleanup, it should be ok for users for remove CRs created by them, but not good to require them to remove CRs created by MC Controller. Let us track that change. And before we have an automated way, could we at least document steps to remove all CRs?

For upgrade, sure probably let us add a one time step to create new version CRs. Let us do it in another PR.

Copy link
Contributor Author

@luolanzone luolanzone Jul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc updated, add a few commands for user as a reference to create a new version CR, and the step to remove ClusterSet in the doc.

@luolanzone luolanzone force-pushed the simplify-clusterset branch 2 times, most recently from 6a39035 to a6d4f9a Compare July 4, 2023 13:45
@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

Copy link
Contributor

@jianjuns jianjuns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we create an issue to track this one:

For ClusterSet cleanup (or cleanup in leader when a member leaves the Cluster), it should be ok for users for remove CRs created by them, but not good to require them to remove CRs created by MC Controller. Let us track that change. And before we have an automated way, could we at least document steps to remove all CRs?

@@ -33,6 +33,10 @@ through tunnels among clusters. The ClusterNetworkPolicy replication feature is
supported since Antrea v1.6.0, and Multi-cluster NetworkPolicy rules are
supported since Antrea v1.10.0.

Antrea v1.13 introduced incompatible changes to ClusterSet CRDs. If you plan to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should say "Antrea v1.13 promoted the ClusterSet CRD version from v1alpha1 to v1alpha2"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

the `ClusterSet` CRD solely defines a ClusterSet. The name of a ClusterSet CR must match the
ClusterSet ID, and a new `clusterID` field specifies the Cluster ID.

To upgrade from a version older than v1.13, once you apply the latest manifests for an
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel we should import new CRD version and update ClusterSet CRs first before deploying the new version MC Controller. If you plan to add auto-conversion in this release, let us update the document after that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I will add a new PR to do auto-conversion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then how about let us add upgrade documentation with that PR and remove the changes in this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A new PR is created #5216, and update the upgrade doc in the new PR.

@luolanzone
Copy link
Contributor Author

A new issue #5203 is created for resources clean up.

@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

5 similar comments
@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

@luolanzone luolanzone force-pushed the simplify-clusterset branch 3 times, most recently from 3f64b59 to b2ee015 Compare July 6, 2023 14:51
@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

@luolanzone luolanzone changed the title Remove ClusterClaim and update ClusterSet spec Remove ClusterClaim and update ClusterSet version Jul 7, 2023
1. Remove ClusterClaim CRD from manifests.
2. Update ClusterSet version from v1alpha1 to v1alpha2, and add a new field ClusterID
in ClusterSet CRD to define a cluster's ID, the new ClusterSet CR's name must match ClusterSetID.
3. Update Multi-cluster docs.

Signed-off-by: Lan Luo <[email protected]>
@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

@jianjuns
Copy link
Contributor

/skip-all

@jianjuns
Copy link
Contributor

@luolanzone : is the PR ready to merge?

@luolanzone
Copy link
Contributor Author

@jianjuns yes, this PR is ready for merge. The remain thing is the ClusterSet conversion controller in #5216, if we are not so sure to modify or delete existing ClusterSet CRs, I think I can create another PR to update the upgrade doc with manual steps.

@jianjuns jianjuns merged commit b3aa3a0 into antrea-io:main Jul 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/release-note Indicates a PR that should be included in release notes. area/multi-cluster Issues or PRs related to multi cluster.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants