flake: timeout reached waiting for pod IDs in ipcache of Cilium pod #361
Comments
Occurred again. I'll open a PR enabling debugging options for connectivity tests to maybe get some more insight into what part of …
This should give us some additional information to debug flakes, e.g. for #361. Signed-off-by: Tobias Klauser
Add error to the debug output for ipcache check failures. This should help solve #361. Signed-off-by: Jarno Rajahalme
In this case the failure is due to this timing out:
The pod
Looking at the
Since this is a flake, it is possible that the ipcache entry was populated right after the last check. To see if something else is going on, #444 adds the error string to the appropriate debug messages.
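To illustrate the "populated right after the last check" case, here is a minimal sketch of a poll-until-timeout loop. It is not the cilium-cli implementation: `wait_for_entry`, `FakeClock`, `simulate`, and the timeout and interval values are all hypothetical names and numbers chosen for the example, and the ipcache lookup is replaced by a simple time comparison.

```python
from typing import Callable


def wait_for_entry(lookup: Callable[[], bool], now: Callable[[], float],
                   sleep: Callable[[float], None],
                   timeout: float, interval: float) -> bool:
    """Poll lookup() until it returns True or the deadline passes."""
    deadline = now() + timeout
    while True:
        if lookup():
            return True
        if now() + interval > deadline:
            # The next poll would land past the deadline, so give up here.
            return False
        sleep(interval)


class FakeClock:
    """Deterministic clock so the example does not depend on wall time."""

    def __init__(self) -> None:
        self.t = 0.0

    def now(self) -> float:
        return self.t

    def sleep(self, seconds: float) -> None:
        self.t += seconds


def simulate(populated_at: float, timeout: float = 10.0,
             interval: float = 2.0) -> bool:
    """Return True if the poller observes an entry that appears at populated_at.

    Hypothetical stand-in for the ipcache check: the "entry" exists once the
    fake clock has reached populated_at.
    """
    clock = FakeClock()
    return wait_for_entry(lambda: clock.now() >= populated_at,
                          clock.now, clock.sleep, timeout, interval)
```

With these example values the last poll happens at t=10, so an entry that appears at t=9 is seen, while one that appears at t=10.5 is reported as a timeout even though it shows up moments later. Including the lookup error in the debug output, as #444 does, helps tell this race apart from a lookup that is failing for another reason.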
I've spotted this in a GKE multi-cluster test:
Deleting the reported pod and restarting the Cilium agents did not seem to help:
I am attaching a sysdump of each cluster:
I think this is a good suggestion from @tklauser:
Temporarily disabling the ipcache check while we investigate what's causing the flake. Ref: #361 Signed-off-by: Michi Mutsuzaki
Add `--skip-ip-cache-check` flag with the default set to true. This is meant to be a temporary flag while we investigate what's causing the flake. Ref: #361 Signed-off-by: Michi Mutsuzaki
Follow-up for #503 to address a review comment on #503. Also add a comment so we don't forget to re-enable the check once issue #361 is resolved. Signed-off-by: Tobias Klauser
Following community meeting: closing since we disabled the ipcache check in the CLI.
flake instances
symptoms
connectivity check fails with an error like this:
other notes
a related upstream issue: cilium/cilium#16542