Waved panicked #2185
Comments
@mturoci Can we know whether the port (10101) is still open even though waved has crashed? In the MLOps Wave app we ping the TCP port as the health check of the container. If the port is still open, the container will still be detected as healthy. cc: @ShehanIshanka
You can check, but I would be surprised if that was the case. Why would your app crash connecting to waved, but the health check would pass?
Are there any steps to reproduce it locally or on a dev environment?
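One way to check this by hand is a plain TCP connect probe against the port. The following is a minimal sketch, not the health check the MLOps app actually uses: the helper name `tcp_port_open` is hypothetical, and the host and the 1-second timeout are assumptions. Only the port number (10101) comes from this thread.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for illustration only. A successful connect shows that
    some process is still listening on the port, nothing more.
    """
    try:
        # create_connection performs the TCP handshake and raises OSError
        # (e.g. ConnectionRefusedError or a timeout) when it cannot complete.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example (assumed host): tcp_port_open("127.0.0.1", 10101)
```

A probe like this only tells you whether a listener is bound to the port. It cannot distinguish a healthy server from one that still holds the socket but no longer serves requests, so an application-level check would be stricter.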
Closing due to not being able to repro, seems like a Keycloak misconfiguration. The place where panic happens is caused by Feel free to reopen in case you manage to repro.
It also happened on our dev instances: https://h2oai.slack.com/archives/C068QB11XV4/p1702298998164059
Now it's happening on cloud-qa too: https://h2oai.slack.com/archives/G01C9KKQLAC/p1704455231835909
Seeing this in 23.10.0 testing as well.
@codyharris-h2o-ai what app?
Just FYI: the debug version of Wave has been deployed in both MC and cloud-qa.
@codyharris-h2o-ai If you see it during release testing, maybe you could use this image instead: "gcr.io/vorvan/h2oai/mlops-wave-app-standalone:0.62.1-resourcefix-debugpanic". It is just for debug purposes and shouldn't be shipped as part of the release. @mturoci kindly implemented some additional logic to help with debugging.
@mturoci the MLOps Wave UI. I'm not sure how often we're running into this.
Another occurrence of this on internal.dedicated: https://h2oai.slack.com/archives/C8MA5HGUU/p1708600075172279
@dulajra, which version of Wave? We have seen positive results using 1.0.2.
Closed in #2246. Feel free to reopen if it appears on recent versions.
Wave SDK Version, OS
0.26.2, Kubernetes (Managed Cloud)
Actual behavior
A Wave app crashed, but for some reason the container stayed up. This caused an outage for at least one customer.
A very similar panic happened at least once before: #1949