Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance Etcd configMap to provide consistency and controllability #771

Closed

Conversation

anveshreddy18
Copy link
Contributor

@anveshreddy18 anveshreddy18 commented Feb 27, 2024

How to categorize this PR?

/area quality
/kind cleanup
/kind enhancement

What this PR does / why we need it:

This PR is to cleanup the Etcd ConfigMap to :

  • Rename the configMap with the prefix of etcd.Name instead of just having etcd. This is for consistency with other resources that gets created as part of Etcd Reconciliation, like leases, role, rolebinding, etc. This also involves cleanup logic for the existing configMap with older name.
  • Make snapshot-count configurable with the flag snapshotCount from spec.etcd section of etcd CR thus the operator is able to control the no.of transactions before committing a snapshot to disk if they want to.
  • Use proper URLs for initial-advertise-peer-urls and advertise-client-urls properties of the configMap and thereby fixing [Enhancement] Simplify etcd configuration and use proper URL instead of using @ as separator #476

Which issue(s) this PR fixes:
Fixes #717

Special notes for your reviewer:

Release note:

Enhances Etcd configuration by organizing ConfigMap naming convention, enabling snapshot-count configuration, and rectifying URL issues for improved functionality and consistency

@anveshreddy18 anveshreddy18 requested a review from a team as a code owner February 27, 2024 11:26
@gardener-robot gardener-robot added needs/review Needs review area/quality Output qualification (tests, checks, scans, automation in general, etc.) related kind/cleanup Something that is not needed anymore and can be cleaned up kind/enhancement Enhancement, improvement, extension size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) labels Feb 27, 2024
@gardener-robot-ci-2 gardener-robot-ci-2 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Feb 27, 2024
@anveshreddy18
Copy link
Contributor Author

/hold until backup-restore PR #715 is merged.

@gardener-robot gardener-robot added the reviewed/do-not-merge Has no approval for merging as it may break things, be of poor quality or have (ext.) dependencies label Feb 27, 2024
@anveshreddy18 anveshreddy18 self-assigned this Feb 27, 2024
Copy link
Contributor

@unmarshall unmarshall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of these changes we will have to be re-done once the #728 is opened up for review. This is a bit unfortunate. What i would suggest is either to include these changes in the refactor PR OR do it after that gets merged.

What happens for configmaps that are still having the old naming convention for all existing etcd-clusters created for shoots which are either active or hibernated?

@@ -443,7 +446,7 @@ func (e *Etcd) GetServiceAccountName() string {

// GetConfigmapName returns the name of the configmap for the Etcd.
func (e *Etcd) GetConfigmapName() string {
return fmt.Sprintf("etcd-bootstrap-%s", string(e.UID[:6]))
return fmt.Sprintf("%s-bootstrap-%s", e.Name, (e.UID[:6]))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should just name this as <etcd-name>-config-<etcd UID suffix>. There is no need of including bootstrap as part of the name IMHO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I left it like that is because the closed PR tried to do the same and since this is a revival of it, I went with what they proposed in it. But yeah that is good, I'll update it.

@@ -208,6 +208,9 @@ type EtcdConfig struct {
// Quota defines the etcd DB quota.
// +optional
Quota *resource.Quantity `json:"quota,omitempty"`
// SnapshotCount defines the number of committed transactions to trigger a snapshot to disk
// +optional
SnapshotCount *int `json:"snapshotCount,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. once we include this as a configurable parameter, we will also have to check what happens when this parameter is updated since this is now going to be part of an etcd spec.
  2. The description can simply be: configures the number of applied Raft entries to hold in-memory before compaction and also provide a link to https://etcd.io/docs/v3.4/op-guide/maintenance/#raft-log-retention

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we will also have to check what happens when this parameter is updated since this is now going to be part of an etcd spec.

Sorry, I didn't get this. If the param is updated, then in the next reconciliation it will be reflected in the configMap. Is your question about how etcd container reacts to this change? I've tried this locally and it didn't affect the cluster in anyway. Pls correct me if this is about something else.

The description can simply be: configures the number of applied Raft entries to hold in-memory before compaction

Sure, this is more readable. I'll update.

@@ -563,6 +563,10 @@ spec:
serverPort:
format: int32
type: integer
snapshotCount:
description: SnapshotCount defines the number of committed transactions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adjust the description.

@anveshreddy18
Copy link
Contributor Author

@unmarshall thanks for the review.

either to include these changes in the refactor PR OR do it after that gets merged

That's fine, if that PR decides to incorporate these, then I'll close this. Else, we can merge this after that.

What happens for configmaps that are still having the old naming convention for all existing etcd-clusters created for shoots which are either active or hibernated?

For that, some temporary code has been added here to delete the old configmaps during the reconciliation. That can be removed once all the configmaps of every shoot namespace are deleted, probably after a few releases.

@shreyas-s-rao shreyas-s-rao added this to the v0.24.0 milestone Mar 26, 2024
@gardener-robot gardener-robot added status/closed Issue is closed (either delivered or triaged) size/xs Size of pull request is tiny (see gardener-robot robot/bots/size.py) labels Jun 24, 2024
@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jun 24, 2024
@gardener-robot gardener-robot removed the size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) label Jun 24, 2024
@anveshreddy18
Copy link
Contributor Author

anveshreddy18 commented Jun 25, 2024

This PR was closed automatically by Github due to diff becoming null at some point while rebasing the PR on #777. I didn't think about reopening it and instead created a new PR after rebasing. The new PR is #812, for reference.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/quality Output qualification (tests, checks, scans, automation in general, etc.) related kind/cleanup Something that is not needed anymore and can be cleaned up kind/enhancement Enhancement, improvement, extension needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) needs/review Needs review reviewed/do-not-merge Has no approval for merging as it may break things, be of poor quality or have (ext.) dependencies reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) size/xs Size of pull request is tiny (see gardener-robot robot/bots/size.py) status/closed Issue is closed (either delivered or triaged)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

☂️ Etcd ConfigMap changes
6 participants