Merge remote-tracking branch 'upstream/main' into merge-upstream-v0.20.0 #7

Merged: 27 commits merged into db_main on Oct 18, 2024

@yuchen-db yuchen-db commented Oct 9, 2024

johannaratliff and others added 27 commits April 4, 2024 12:54
* Update dependencies

* Correct PR number
…fana#146)

This addresses a bug in rollout-operator where:

1. Kubernetes receives a request to downscale a statefulset by `X` hosts.
2. The prepare-downscale admission webhook attempts to prepare `X` pods for shutdown by sending an HTTP `POST` to their handler identified by the `grafana.com/prepare-downscale-http-path` and `-port` annotations.
3. At least one of these requests fails. The admission webhook returns an error to Kubernetes, so the downscale is not approved.
4. 💥 But some hosts may have been prepared for downscale. 💥 

This PR adds cleanup logic to issue `DELETE` requests on all involved pods if any of the `POST`s failed. Notes:
* `DELETE` calls are attempted once.
* `DELETE` failures are logged but otherwise ignored.
* For simplicity, we'll invoke `DELETE` on all of the pods involved in the scaledown operation, not just ones that received a POST.
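The cleanup behavior described above can be sketched roughly as follows. This is an illustration only, not the rollout-operator code: `doFunc`, `httpDo`, `prepareOrRollback`, and `errDemo` are hypothetical names, and sending the `POST`s sequentially and stopping at the first failure is an assumption of the sketch.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"net/http"
)

// doFunc sends one request and reports failure. It is abstracted (hypothetical
// name) so the sketch can be exercised without a cluster.
type doFunc func(method, url string) error

// httpDo is one possible doFunc backed by net/http; any non-2xx status is an error.
func httpDo(c *http.Client) doFunc {
	return func(method, url string) error {
		req, err := http.NewRequest(method, url, nil)
		if err != nil {
			return err
		}
		resp, err := c.Do(req)
		if err != nil {
			return err
		}
		defer resp.Body.Close()
		if resp.StatusCode/100 != 2 {
			return fmt.Errorf("%s %s: status %d", method, url, resp.StatusCode)
		}
		return nil
	}
}

// prepareOrRollback POSTs to each endpoint in order. If a POST fails, it sends
// a single DELETE to every endpoint (not only those already POSTed), logs any
// DELETE failure without acting on it, and returns the POST error.
func prepareOrRollback(do doFunc, endpoints []string) error {
	var postErr error
	for _, ep := range endpoints {
		if err := do(http.MethodPost, ep); err != nil {
			postErr = err
			break // assumption: stop at the first failed POST
		}
	}
	if postErr == nil {
		return nil
	}
	for _, ep := range endpoints {
		if err := do(http.MethodDelete, ep); err != nil {
			log.Printf("cleanup DELETE %s failed (ignored): %v", ep, err)
		}
	}
	return postErr
}

// errDemo is a made-up error used only by the demo below.
var errDemo = errors.New("simulated POST failure")

func main() {
	var calls []string
	// fake stands in for real pods: the POST to "pod-1" fails.
	fake := func(method, url string) error {
		calls = append(calls, method+" "+url)
		if method == http.MethodPost && url == "pod-1" {
			return errDemo
		}
		return nil
	}
	err := prepareOrRollback(fake, []string{"pod-0", "pod-1", "pod-2"})
	fmt.Println(err)
	fmt.Println(calls)
}
```

Sending `DELETE` to every pod keeps the rollback path free of per-pod bookkeeping, which matches the "for simplicity" note above.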

This doesn't fix the similar issue where the replica count changing from 10->9->10 leaves one pod prepared for shutdown. (A fix for that is in the works.)
Add a changelog entry for grafana#146, and prepare changelog for v0.16.0.


Co-authored-by: Patryk Prus <[email protected]>

---------

Co-authored-by: Patryk Prus <[email protected]>
* Swap base image from alpine to distroless

* Remove user setup

* Use nonroot image

* Add different base image for boringcrypto

* Add changelog entry
For better debuggability when there are concurrent webhook calls.
* Include UserInfo.Username in 'handling request' log.

* Changelog.
* Add support for specifying percentage in rollout-max-unavailable annotation.

* CHANGELOG.md
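
As a rough illustration of what accepting a percentage in the annotation could involve, here is a minimal sketch. The function name `maxUnavailable` is hypothetical, and the rounding rule (round down, never below 1) and the accepted range (1% to 100%) are assumptions of this sketch, not confirmed rollout-operator behavior.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// maxUnavailable interprets value as either an absolute pod count ("2") or a
// percentage of replicas ("25%"). Rounding down and the minimum of 1 are
// assumptions of this sketch.
func maxUnavailable(value string, replicas int) (int, error) {
	value = strings.TrimSpace(value)
	if strings.HasSuffix(value, "%") {
		p, err := strconv.Atoi(strings.TrimSuffix(value, "%"))
		if err != nil || p < 1 || p > 100 {
			return 0, fmt.Errorf("invalid percentage %q", value)
		}
		n := replicas * p / 100 // integer division rounds down
		if n < 1 {
			n = 1 // always let at least one pod be unavailable
		}
		return n, nil
	}
	n, err := strconv.Atoi(value)
	if err != nil || n < 1 {
		return 0, fmt.Errorf("invalid max unavailable %q", value)
	}
	return n, nil
}

func main() {
	for _, v := range []string{"2", "25%", "10%", "abc%"} {
		n, err := maxUnavailable(v, 10)
		fmt.Println(v, n, err)
	}
}
```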
Fix unbalanced key/value pairs in a log call, which led to a log message like this:
`level=error ts=2024-06-13T03:30:49.769575693Z pod=ingester-zone-a-16 url=http://ingester-zone-a-16.ingester-zone-a.mimir-dev.svc.cluster.local./ingester/prepare-partition-downscale errorsendingHTTPPOSTrequesttoendpoint=err`
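
To show how an unbalanced key/value list can produce a fused key like the one above, here is a small sketch. `logfmtLine` is a simplified, hypothetical stand-in for a key/value logger (arguments are consumed as alternating key, value pairs, and spaces are stripped from keys); it is not the logging library's implementation, and the calls in `main` are made-up examples rather than the original code.

```go
package main

import (
	"fmt"
	"strings"
)

// logfmtLine formats alternating key, value arguments as "key=value" pairs.
// Spaces are removed from keys, and a trailing key without a value is
// rendered with the placeholder "(MISSING)". Simplified for illustration.
func logfmtLine(keyvals ...any) string {
	var parts []string
	for i := 0; i < len(keyvals); i += 2 {
		key := strings.ReplaceAll(fmt.Sprint(keyvals[i]), " ", "")
		var val any = "(MISSING)"
		if i+1 < len(keyvals) {
			val = keyvals[i+1]
		}
		parts = append(parts, fmt.Sprintf("%s=%v", key, val))
	}
	return strings.Join(parts, " ")
}

func main() {
	// Unbalanced: the message is passed without a "msg" key, so it is
	// treated as a key and the literal "err" becomes its value.
	fmt.Println(logfmtLine("error sending HTTP POST request to endpoint", "err"))

	// Balanced: every value is preceded by its key.
	fmt.Println(logfmtLine("msg", "error sending HTTP POST request to endpoint", "err", "connection refused"))
}
```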
* When checking the downscale delay in the statefulset, allow the downscale if some pods at the end of the statefulset are ready to be downscaled.

* CHANGELOG.md
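
One way to picture a partial downscale is the sketch below. The function `allowedReplicas` and its inputs are hypothetical: it assumes the caller already knows, for each pod ordinal, whether that pod's downscale delay has elapsed, and it only removes a contiguous run of such pods from the highest ordinal down.

```go
package main

import "fmt"

// allowedReplicas returns the replica count the statefulset may shrink to.
// elapsed[i] reports whether pod ordinal i has waited out its downscale
// delay. Pods are removed from the highest ordinal downwards, stopping at
// the first pod that is not ready or once desired is reached.
func allowedReplicas(current, desired int, elapsed []bool) int {
	allowed := current
	for ord := current - 1; ord >= desired; ord-- {
		if ord < 0 || ord >= len(elapsed) || !elapsed[ord] {
			break // this pod must stay, so every lower ordinal stays too
		}
		allowed = ord
	}
	return allowed
}

func main() {
	// Five pods, target of two replicas; only ordinals 3 and 4 are ready.
	elapsed := []bool{false, false, false, true, true}
	fmt.Println(allowedReplicas(5, 2, elapsed))
}
```

With these inputs the sketch permits a downscale from 5 to 3 replicas instead of rejecting the request outright because ordinal 2 is not yet ready.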
…s to store (grafana#151)

Fix a snag found in grafana#146: if the "downscaled" annotation/configmap fails to persist, the scale operation is denied, but the pods are not informed via `DELETE` that they should no longer shut down.
)

* Only scale up zone after leader zone replicas are ready

* Update CHANGELOG

* Change to only scaling once all replicas are ready

* Rename config annotation

* Add log line

* remove redundant test

* Update changelog
* Update dependencies

* Update CHANGELOG

* Fix build errors

* Upgrade docker and grpc for remaining CVEs
* Update Go to 1.23

* Add some nolint
* Added `grafana.com/rollout-mirror-replicas-from-resource-update-status-replicas` annotation to optionally disable patching of the reference resource when scaling is based on a reference resource.

* Review findings.

* CHANGELOG entry.
…-status-replicas` annotation (grafana#171)

* Renamed `grafana.com/rollout-mirror-replicas-from-resource-write-back-status-replicas` annotation to `grafana.com/rollout-mirror-replicas-from-resource-write-back`

* Fix changelog.
@yuchen-db yuchen-db changed the title merge upstream v0.20.0 downscale Oct 13, 2024
@yuchen-db yuchen-db force-pushed the yuchen-db/merge-upstream-v0.20.0 branch from 0a62d87 to 36b3396 Compare October 13, 2024 11:30
@yuchen-db yuchen-db changed the title downscale Update downscale logic to support custom port and service name Oct 14, 2024
@yuchen-db yuchen-db marked this pull request as ready for review October 14, 2024 23:04
@yuchen-db yuchen-db requested review from a team, christopherzli, jnyi, hczhu-db and yulong-db and removed request for a team October 14, 2024 23:04
@yuchen-db yuchen-db force-pushed the yuchen-db/merge-upstream-v0.20.0 branch from 36b3396 to 7d10753 Compare October 17, 2024 21:25
@yuchen-db yuchen-db changed the title Update downscale logic to support custom port and service name Merge remote-tracking branch 'upstream/main' into merge-upstream-v0.20.0 Oct 17, 2024
@yuchen-db yuchen-db merged commit 3f1e1e2 into db_main Oct 18, 2024
10 checks passed
8 participants