Experiences installing OKD 4.8 on OpenStack #949
Replies: 7 comments
-
From the access solution:
So this is a Kubernetes problem: it needs to be fixed either in-tree or by deploying CSI drivers only.
-
You don't need the kubeconfig to troubleshoot the cluster; all the info you need is collected in a log bundle.
-
OKD has no support, so clearly we can't list which versions would be supported or compatible. From OCP results we can list versions which are known to work.
-
Sounds like a documentation bug.
-
File an OCP bug for the installer.
-
File an OCP bug for the installer to improve the wording.
-
File a documentation bug to have it stated in the docs.
-
Hello! I recently performed an IPI install of OKD on OpenStack, and while the overall process was quite smooth and I ended up with a fully working cluster, I ran into some roadblocks along the way. When I asked in the OKD Working Group meeting, they suggested I first write this up as a summary in a Discussion here, and then we can figure out how best to break it up into issues or other work for future improvements. Many of these are papercuts, but a few seemed like legitimate bugs or gaps in documentation.
Once we decide in discussion here the best way to proceed with these, I'm happy to write up individual tickets or whatever else makes sense. Just let me know.
Thank you for the awesomeness that is OKD!
A. The cluster-image-registry operator stuck in DEGRADED
During the install step where the installer waits for all of the operators to start, the cluster-image-registry operator was stuck in a DEGRADED state. Listing logs for the image-registry pods gave the following error message:
This is apparently a known problem in OCP documented in https://access.redhat.com/solutions/6020241 .
We were able to get past this by following the instructions in https://docs.okd.io/4.8/storage/container_storage_interface/persistent-storage-csi-cinder.html to switch the default storage provisioner from kubernetes.io/cinder to cinder.csi.openstack.org, and then deleting and recreating the openshift-image-registry PersistentVolumeClaim with metadata.annotations.volume.beta.kubernetes.io/storage-provisioner: cinder.csi.openstack.org. After that, the operator healed itself and was able to come up successfully.
This ends up raising a number of points:
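The PVC recreation step described above can be sketched roughly as follows. This is only an illustration: it builds a manifest carrying the storage-provisioner annotation and prints it as JSON. The PVC name image-registry-storage, the 100Gi size, and the ReadWriteOnce access mode are assumptions that do not come from this write-up, so check them against the claim that actually exists in your cluster before reusing any of it.

```python
import json


def registry_pvc(provisioner="cinder.csi.openstack.org",
                 name="image-registry-storage",       # assumed PVC name
                 namespace="openshift-image-registry",
                 size="100Gi"):                        # assumed size
    """Build a PVC manifest annotated with the given storage provisioner."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {
                # The annotation named in the workaround above.
                "volume.beta.kubernetes.io/storage-provisioner": provisioner,
            },
        },
        "spec": {
            "accessModes": ["ReadWriteOnce"],          # assumed access mode
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Print the manifest as JSON so it can be reviewed before use.
    print(json.dumps(registry_pvc(), indent=2))
```

After deleting the old claim, the printed JSON could be reviewed and then fed to something like oc apply -f -.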
B. When the installer failed, it wasn't clear how to connect to the cluster to troubleshoot
When the installer succeeds, it gives a helpful message pointing to the generated kubeconfig on local disk for connecting to the cluster. When it fails (at least if it fails late enough that there's a cluster), a kubeconfig file is still available, but it doesn't tell the user about it. It would be nice if it made this explicit in its messages.
For what it's worth, the docs do say that the kubeconfig is available during installation: https://docs.okd.io/4.8/installing/installing_openstack/installing-openstack-installer-custom.html#installation-osp-verifying-cluster-status_installing-openstack-installer-custom
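As a rough illustration of the kind of hint the installer could print on failure, the sketch below looks for the generated kubeconfig under the installation directory and prints an export line if it is there. The helper name find_kubeconfig is hypothetical, and the auth/kubeconfig location is assumed to be relative to the directory passed to the installer with --dir.

```python
from pathlib import Path


def find_kubeconfig(install_dir):
    """Return the installer-generated kubeconfig path, or None if absent."""
    candidate = Path(install_dir) / "auth" / "kubeconfig"
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    # "." stands in for the installation directory (an assumption).
    path = find_kubeconfig(".")
    if path is None:
        print("No kubeconfig found; the install may have failed before one was written.")
    else:
        print(f"export KUBECONFIG={path.resolve()}")
```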
C. Difficult to figure out which OpenStack versions are compatible
The docs only mention support for Red Hat OpenStack Platform versions 13 and 16. This leads to several questions:
D. Default machineNetwork overlapped with my workstation's subnet
I got a cluster running, set up a load balancer, and couldn't get a response when trying to access the cluster console or API. It turns out that the IP subnet my workstation is on overlaps the default subnet used for networking.machineNetwork. This meant that OpenStack Neutron's NAT couldn't correctly route packets from the machineNetwork back to my workstation.
This was, admittedly, my fault for misunderstanding how Neutron NAT works.
However, it was also tricky to troubleshoot, and there are a couple of things we could do to make this better:
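The overlap described in this section is easy to check ahead of time. The sketch below is an illustration, not something proposed in the original discussion. It uses Python's standard ipaddress module to compare a machineNetwork CIDR against a workstation subnet. The 10.0.0.0/16 value is assumed to be the installer's documented default for networking.machineNetwork, and the workstation subnet is a made-up example.

```python
import ipaddress

# Assumed installer default for networking.machineNetwork.
DEFAULT_MACHINE_NETWORK = "10.0.0.0/16"


def overlaps(machine_cidr, other_cidr):
    """Return True if the two CIDR ranges share any addresses."""
    # strict=False also accepts host addresses such as 10.0.42.17/24.
    a = ipaddress.ip_network(machine_cidr, strict=False)
    b = ipaddress.ip_network(other_cidr, strict=False)
    return a.overlaps(b)


if __name__ == "__main__":
    workstation = "10.0.42.0/24"  # hypothetical workstation subnet
    if overlaps(DEFAULT_MACHINE_NETWORK, workstation):
        print(f"{workstation} overlaps machineNetwork {DEFAULT_MACHINE_NETWORK}; "
              "consider choosing a different machineNetwork CIDR")
    else:
        print("no overlap detected")
```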
E. Mysterious warning about default Docker Bridge subnet
After I worked through D and changed the IP range for the machineNetwork, the installer gave the following warning:
It wasn't clear to me why this is a warning or what impact that would have. OKD uses CRI-O, not Docker, so I guessed it was vestigial, and indeed the cluster comes up fine.
F. Hard-to-read error message for missing password in clouds.yaml
I generated a clouds.yaml using OpenStack's Horizon web UI to point the installer to my OpenStack and project. I failed to follow the "IMPORTANT" message in step 1 of https://docs.okd.io/4.8/installing/installing_openstack/installing-openstack-installer-custom.html#installation-osp-describing-cloud-parameters_installing-openstack-installer-custom and therefore forgot to add a password to the auth field. While this was 100% my fault, the error message I got was a bit hard to understand:
In particular, it would have been helpful if that error message indicated that it was an OpenStack password that was necessary, or mentioned the clouds.yaml.
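To make that suggestion concrete, here is a minimal sketch of a check that names both the file and the missing field. It works on an already-parsed clouds.yaml, represented as a plain dict so that no YAML library is needed, and it assumes password-based authentication. The function names, the list of required fields, and the sample values are all hypothetical; this is not how the installer validates credentials.

```python
# Illustrative list for password-based auth; not the installer's own rules.
REQUIRED_AUTH_FIELDS = ("auth_url", "username", "password")


def missing_auth_fields(clouds, cloud_name):
    """List required auth fields that are absent or empty for one cloud."""
    auth = clouds.get("clouds", {}).get(cloud_name, {}).get("auth", {})
    return [field for field in REQUIRED_AUTH_FIELDS if not auth.get(field)]


def describe_problem(clouds, cloud_name):
    """Return a message naming clouds.yaml and the missing fields, or None."""
    missing = missing_auth_fields(clouds, cloud_name)
    if not missing:
        return None
    return (f"clouds.yaml: cloud '{cloud_name}' is missing OpenStack auth "
            f"field(s): {', '.join(missing)}")


if __name__ == "__main__":
    # Parsed form of a Horizon-style clouds.yaml with no password
    # (all values are placeholders).
    clouds = {"clouds": {"openstack": {"auth": {
        "auth_url": "https://keystone.example.com:5000/v3",
        "username": "demo",
        "project_name": "demo",
    }}}}
    print(describe_problem(clouds, "openstack"))
```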
G. Can Octavia load balancers for Ingress be configured without using Kuryr?
My initial read of the differences between https://docs.okd.io/4.8/installing/installing_openstack/installing-openstack-installer-custom.html#installing-openstack-installer-custom and https://docs.okd.io/4.8/installing/installing_openstack/installing-openstack-installer-kuryr.html#installing-openstack-installer-kuryr led me to believe that without Kuryr, I would need to create my own Octavia load balancer after the installation.
This is the route I ended up using, creating my own floating IPs before installation, and then after installation, creating my own load balancer for ingress, adding the external ingress floating IP to it and pointing its pool at the worker nodes.
However, I notice that the docs at the latest version (https://docs.okd.io/latest/installing/installing_openstack/installing-openstack-installer-custom.html#installing-openstack-installer-custom) include a cloud.conf section that appears to create a load balancer for me, but the 4.8 version (https://docs.okd.io/4.8/installing/installing_openstack/installing-openstack-installer-custom.html#installing-openstack-installer-custom) does not.
When I tried the cloud.conf instructions, they didn't appear to actually create a load balancer, but it's possible I misunderstood how to write the cloud.conf.