Troubleshooting Controller Freezes #10211
This issue is currently awaiting triage. If Ingress contributors determine that this is a relevant issue, they will accept it by applying the appropriate triage label.
/remove-kind bug
You can look at other info like
I'm getting this:
Are you making changes to the DNS settings? It looks like there are changes with 1.1.1.1 in the settings. Do you have more than three nameservers in the config? ingress-nginx uses whatever DNS settings are available in the cluster. There is a known issue with trying to apply more than three: https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/#known-issues
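For reference, the known issue linked above stems from glibc's limit of three `nameserver` entries in `resolv.conf`: any entries beyond the third are silently ignored. A hypothetical `/etc/resolv.conf` that would trigger it (all addresses except 1.1.1.1 are made-up examples, not from this thread):

```
# Hypothetical resolv.conf with too many nameservers.
# glibc only honors the first three; the fourth is silently ignored.
nameserver 1.1.1.1
nameserver 8.8.8.8
nameserver 192.168.1.1
nameserver 10.96.0.10   # ignored: beyond the three-entry glibc limit
```

If the cluster DNS service address ends up in the ignored position on a node, pods inheriting the node's resolver config can lose in-cluster name resolution.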
Thank you for the tips @strongjz; I changed the DNS settings, and now I've got this in my /etc/resolv.conf file:
However, my ingress doesn't seem to work, and I'm not sure why. I also noticed that I have a self-signed certificate. Could that be interfering with the ingress functionality? My logs:
Those logs look fine for a controller startup. Can you be more precise about what's not working?
@strongjz, I apologize for the late response. I had to wait for the error to reproduce. After a certain duration, nginx stops performing its duties properly. The pod is running, but there are no insights in the logs. I'm unable to see the GET requests to a pod (tested using curl and a browser, from both inside and outside the cluster). For your information, I'm using flannel, MetalLB, and cert-manager with my self-signed certificate. Here are the outcomes of my traceroute and curl tests:
Traceroute, from outside the cluster:
From inside the cluster:
Curl, from outside the cluster:
From inside the cluster:
This gives me the following logs:
However, the browser connection from outside the cluster is still problematic. Strangely, I conducted tests on two different machines and initially faced issues, but upon retrying on a new machine (unknown to the server), I received an HTTP 200 status code, indicating a successful request. Alongside this, I also obtained some logs, provided below:
What could be the cause of these issues? Any assistance would be much appreciated.
I've observed an intriguing behavior within the pods. It seems that the resolution of the host test.labo.bi fails intermittently. Here's what I experienced when executing the curl command repeatedly:
The "Could not resolve host" error occurs frequently, but occasionally the request does succeed, indicating that the service is functional at times. This intermittent behavior is puzzling. Does anyone know why?
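The intermittent failures described above can be reproduced with a simple retry loop. A minimal sketch, assuming `test.labo.bi` is the ingress host from this thread; `getent` is used instead of curl purely to isolate name resolution from HTTP:

```shell
# Repeatedly try to resolve the ingress hostname to expose flaky DNS.
# Replace test.labo.bi with your own host; run from inside a pod to
# exercise the cluster DNS path (CoreDNS) rather than the node resolver.
for i in 1 2 3 4 5; do
  if getent hosts test.labo.bi >/dev/null 2>&1; then
    echo "attempt $i: resolved"
  else
    echo "attempt $i: FAILED (could not resolve host)"
  fi
  sleep 1
done
```

If the output alternates between `resolved` and `FAILED`, the problem is upstream of the ingress controller, in DNS resolution itself.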
The ingress controller doesn't seem to be the problem; I'm leaning towards an issue with CoreDNS. Given this, I'm closing the issue.
What happened:
The controller appears to crash sporadically, with the duration between crashes varying between a day and a week. During these crashes, the controller stops functioning entirely: it ceases to produce logs and fails to route. Restarting the controller usually remedies the issue; however, on some occasions, I need to reset the entire cluster to restore functionality.
What you expected to happen:
The controller should consistently and properly route to the service.
NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):
Kubernetes version:
Environment:
Cloud provider or hardware configuration: on premise
OS (e.g. from /etc/os-release): almalinux 8.8
Kernel (e.g. uname -a): Linux master 4.18.0-477.13.1.el8_8.x86_64 #1 SMP Tue May 30 14:53:41 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux
Install tools:
kubeadm
flannel
metallb
longhorn
ingress nginx
Basic cluster related info:
kubectl version
kubectl get nodes -o wide
The controller was installed via the following commands, for either the cloud or the bare-metal configuration:
or
then the Service type was changed to LoadBalancer.
kubectl describe ingressclasses
The Ingresses