KServe and cert-manager webhooks are failing #2660
Comments
Can you try with the master branch as well? Please also check whether your install command is up to date in the master branch readme.md and follow the installation instructions with Kind as close as possible. |
I was able to resolve this by increasing the resources allocated to the machine. Was getting capped out by CPU, maybe you're facing similar? |
Hey @juliusvonkohout, yes my local machine's master branch is up to date. |
@dnapier Hi, I tried to increase CPU resources in the --kubeconfig file but it says there is no resources field in v1alpha4.Node. Could you please tell me what you tried? |
When I ran I encountered another issue following this which was the activator of knative-serving crashing, but I do not believe that is related to the error you're seeing here. |
CC @diegolovison then |
Are you using kind with docker ? |
Hello everyone, I'm facing the same issue. I have to deploy Kubeflow for an internship project and I hit the same problem with Kubeflow v1.8. After running "while ! kustomize build example | kubectl apply -f -; do echo "Retrying to apply resources"; sleep 10; done" I get this error: [screenshot: Capture d'écran 2024-04-09 151931] My Kubernetes cluster is running on Tanzu. |
Please just test with Kind as explained in the readme.md in the master branch, to make sure that it is not a Kubernetes issue of your own cluster. |
Sorry, I didn't catch that this was addressed to me. Yes in my case, I am using kind with docker. Debian 12 host. |
What is the amount of CPU and memory that you have available? |
12GB of memory on the system, 8 core processor (Intel(R) Xeon(R) E5-2620). And yes I was strictly following the installation instructions. |
I already tested the v1.8 on minikube and I'm facing the same issue... |
I believe you will need to have more resources. I have 20 cores and 36GB of memory
I wasn't able to make it work on Minikube. Only with kind |
I've just attempted to install it using a local kind cluster, but it didn't work. I'm encountering another issue... |
That's the exact issue I'm facing, which @diegolovison suggests is caused by a lack of available resources. I'm working on doubling my memory to 24GB to test whether that resolves it. Will update asap. |
Interesting.... I managed to install v1.8 on Minikube just now. I'm curious why it's working now. My suspicion is that I might encounter issues installing it on my Tanzu Cluster, perhaps due to a cluster-related problem. |
Do you mind sharing your cpu/memory for comparison? |
8 Cores/16G |
minikube with podman worked for me with 16 GB if you strip the example distribution down a bit. Otherwise you might need 32 GB. @diegolovison, we should add the memory and core requirements at the top of the installation instructions with kind. |
Do you believe that 32 GB and 20 cores? |
I do not understand your question. |
Should we document that 32 GB of RAM and 20 CPU cores are the minimum required to install Kubeflow locally? |
Not that I have a say here, but I think that's a great idea. |
I would go with 16 cores and 32 GB of memory as a recommendation. Or are you sure that 16 cores are not enough? It is possible to do it with far less, but that is then left up to the end user. |
Ok. Sounds good |
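For anyone who wants to check their machine against the 16-core / 32 GB recommendation above before installing, here is a minimal preflight sketch. It assumes a Linux host with `nproc` and `/proc/meminfo`; the `MIN_CORES` and `MIN_MEM_GB` variables and the function names are made up for illustration and are not part of the Kubeflow manifests.

```shell
#!/usr/bin/env bash
# Preflight sketch: compare this Linux host against the 16-core / 32 GB
# recommendation from this thread. Both thresholds are overridable assumptions.
MIN_CORES="${MIN_CORES:-16}"
MIN_MEM_GB="${MIN_MEM_GB:-32}"

# Number of CPU cores visible to this shell.
host_cores() { nproc; }

# Total memory in whole GiB (rounded down), read from /proc/meminfo (Linux only).
host_mem_gb() { awk '/^MemTotal:/ { print int($2 / 1024 / 1024) }' /proc/meminfo; }

# meets_minimum ACTUAL REQUIRED: succeeds when ACTUAL >= REQUIRED.
meets_minimum() { [ "$1" -ge "$2" ]; }

# Prints one warning per shortfall; returns non-zero if anything is short.
check_resources() {
  local cores mem status=0
  cores="$(host_cores)"
  mem="$(host_mem_gb)"
  if ! meets_minimum "$cores" "$MIN_CORES"; then
    echo "WARN: $cores cores found, $MIN_CORES recommended"
    status=1
  fi
  if ! meets_minimum "$mem" "$MIN_MEM_GB"; then
    echo "WARN: ${mem} GiB memory found, ${MIN_MEM_GB} GiB recommended"
    status=1
  fi
  return "$status"
}

check_resources || echo "Installation may fail or be slow on this machine."
```

Note that `MemTotal` is usually a little below the installed RAM, so a 32 GB machine may still trigger the memory warning; lowering the threshold slightly (for example `MIN_MEM_GB=30`) avoids that. With kind on Docker, the limits that matter are the ones Docker itself has, which may be lower than the host's.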
@biswajit-9776 Please retry with the latest master branch and readme. If you still encounter problems, please open a new issue with our new template. |
While installing Kubeflow using the command:
while ! kustomize build example | awk '!/well-defined/' | kubectl apply -f -; do echo "Retrying to apply resources"; sleep 10; done
Some webhooks could not be reached:
The KServe webhook issue was previously encountered in #2553. Should the changes made in #2627 have prevented this error from reproducing? As for the cert-manager webhook, #2585 had a "no route to host" problem, while mine is "connection refused". It could be a Kubernetes root-level issue or a deeper networking-stack issue, as in https://cert-manager.io/docs/troubleshooting/webhook/#cause-2-eks-on-a-custom-cni
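One way to tell a webhook that is merely slow to start from one that is genuinely unreachable is to wait for its Deployment to become Available before re-running the apply loop. The sketch below wraps `kubectl wait`; the helper name `wait_for_deployment` is hypothetical, and the namespace and Deployment names in the usage comment are assumptions that may differ in your install.

```shell
#!/usr/bin/env bash
# Sketch: block until a webhook Deployment reports Available, so that a
# "connection refused" caused by a pod that is still starting can be ruled out.

# wait_for_deployment NAMESPACE NAME [TIMEOUT]
# Returns kubectl's exit status: non-zero if the timeout expires first.
wait_for_deployment() {
  local namespace="$1" name="$2" timeout="${3:-180s}"
  kubectl wait --namespace "$namespace" \
    --for=condition=Available "deployment/$name" \
    --timeout="$timeout"
}

# Usage (assumed names; list yours with `kubectl get deployments -A`):
#   wait_for_deployment cert-manager cert-manager-webhook
#   wait_for_deployment kubeflow kserve-controller-manager
```

If the wait times out, `kubectl describe` on the Deployment and its pods usually shows whether the pod is pending for lack of CPU or memory, which is the cause suggested in the comments.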
kustomize version:
v5.3.0
My kubectl pods are: