Wait for readiness of newly created Pods before passing them to KafkaRoller #10746

Open
wants to merge 1 commit into
base: main
from

Conversation

scholzj
Member

@scholzj scholzj commented Oct 21, 2024

Type of change

  • Enhancement / new feature

Description

Right now, the KafkaReconciler creates or updates the PodSet and then immediately moves on to the KafkaRoller. As a result, newly created Pods are sometimes still starting when the KafkaRoller runs, and when their startup takes longer, the KafkaRoller identifies them as stuck and fails.

This PR inserts a wait between the PodSet creation/update and the KafkaRoller, which should help to avoid these issues. If a Pod does not become ready for some reason, the reconciliation fails. But the next reconciliation no longer treats these Pods as new, so it proceeds to the KafkaRoller, which can fix them.

(We already had this logic in the past when using StatefulSets.)
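
For readers who want a concrete picture of the flow described above, here is a minimal sketch. It is not the code from this PR: the class `NewPodReadinessSketch`, its methods, and the idea of passing readiness as a `Predicate` are all hypothetical names chosen for illustration, and the real operator works asynchronously against the Kubernetes API rather than with a blocking poll loop.

```java
import java.time.Duration;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.TimeoutException;
import java.util.function.Predicate;

/** Hypothetical illustration only; none of these names come from the Strimzi code base. */
final class NewPodReadinessSketch {

    private NewPodReadinessSketch() { }

    /**
     * Returns the Pods that are desired now but did not exist before the PodSet
     * was created or updated. On a later reconciliation the same Pods already
     * exist, so they are no longer reported as new.
     */
    static Set<String> newPods(Set<String> podsBefore, List<String> desiredPods) {
        Set<String> created = new LinkedHashSet<>(desiredPods);
        created.removeAll(podsBefore);
        return created;
    }

    /**
     * Polls until every given Pod is ready, or throws once the timeout elapses.
     * A caller would let the exception fail the current reconciliation instead
     * of handing the still-starting Pods to the roller.
     */
    static void waitForReadiness(Set<String> pods,
                                 Predicate<String> isReady,
                                 Duration pollInterval,
                                 Duration timeout)
            throws TimeoutException, InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (pods.stream().allMatch(isReady)) {
                return; // all new Pods are ready (or there were none)
            }
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException("Pods not ready in time: " + pods);
            }
            Thread.sleep(pollInterval.toMillis());
        }
    }
}
```

In this sketch a reconciliation would call `newPods(...)` right after the PodSet update, then `waitForReadiness(...)` on the result, and only then continue to the rolling step. When nothing new was created, the wait returns immediately, which mirrors the point that a retry goes straight to the KafkaRoller.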

Checklist

  • Write tests
  • Make sure all tests pass
  • Try your changes from Pod inside your Kubernetes and OpenShift cluster, not just locally

@scholzj scholzj added this to the 0.45.0 milestone Oct 21, 2024
Member Author

scholzj commented Oct 21, 2024

/azp run regression

Azure Pipelines successfully started running 1 pipeline(s).

Member Author

scholzj commented Oct 23, 2024

/azp run regression

Azure Pipelines successfully started running 1 pipeline(s).

@scholzj scholzj force-pushed the wait-for-readiness-of-newly-scaled-pods branch from 5231e16 to f5df40c Compare October 31, 2024 19:57
Member Author

scholzj commented Nov 1, 2024

/azp run regression

Azure Pipelines successfully started running 1 pipeline(s).

Member Author

scholzj commented Nov 1, 2024

/azp run zookeeper-regression

Azure Pipelines successfully started running 1 pipeline(s).

@scholzj scholzj marked this pull request as ready for review November 1, 2024 13:52