Hostname resolution via reverse DNS lookups broken in OKD 4.7/4.8 #648

gudroot · 2021-05-25T22:13:27Z

Describe the bug
I installed a 4.8 RC2 cluster (vsphere UPI with static ips via ignition). The nodes came up for the first time with their correct hostname from DHCP4 but after the reboot their hostname was 'localhost'. I then reinstalled the cluster and provided an ignition snippet that sets /etc/hostname and the cluster is now running more or less fine. This was different before, i.e. I didn't have to set /etc/hostname with releases up to and including 4.7.

Version
4.8.0-0.okd-2021-05-22-053824 on vsphere UPI

How reproducible
Always without the hostname ignition snippet.

Log bundle
If you really need a log bundle I can reinstall the cluster and create one on Thursday. However, I doubt it is really necessary, as the issue appears to happen before the installer does much of its job.

vrutkovs · 2021-05-27T10:03:37Z

> I then reinstalled the cluster and provided an ignition snippet that sets /etc/hostname

I don't think it's a good idea. /etc/hostname may be overwritten by NetworkManager.
If it's configured as DHCP it expects to have received the hostname via DHCP too.

How was the initial hostname passed to the cluster - via DHCP/kernel args/custom ignition?

gudroot · 2021-05-27T13:57:57Z

FCOS docs state this as a method to set the hostname (https://docs.fedoraproject.org/en-US/fedora-coreos/hostname/), so I thought it might be ok to try. Also /usr/local/sbin/set-valid-hostname.sh states that /etc/hostname is authoritative. In my case the hostname is in DHCP and DNS and since I am in control of both of them they can be considered pretty static.

But don't get me wrong, I don't insist on manually setting the hostname if there are other means to get a valid hostname. It was just a check if the installation would run through if the issue (every node is 'localhost' after the first reboot) was put aside. It did.

I now installed the cluster without static ips and got a valid hostname at first boot and later. It was taken from DHCP4 in both cases and the 'transient hostname' was set. /usr/local/sbin/set-valid-hostname.sh was happy and the cluster nodes are up.

gudroot · 2021-06-02T15:11:21Z

I experimented a bit with dhcp and static ips. My goal is to have nodes with ipv4 and ipv6 on the primary interface and ipv4 static ips without dhcp on the second interface that is used for storage. Please note that I didn't install a full cluster every time, so I don't know if the first experiments would have resulted in a working cluster.

0. dhcp on the primary interface (ipv4 only) and static ip on the second: This seems to work, I have seen mentions of this on slack, but I wanted dual stack hosts, so I didn't try it.
1. dhcp and dhcp6 for the primary interface via dracut cmdline rd.neednet=1 ip=dhcp,dhcp6 rd.net.timeout.dhcp=3 rd.net.timeout.ipv6dad=3 results in a primary interface with an ipv4 config from dhcp4 and nothing more than a link local address in ipv6. dhcp6 is never even attempted. Same result with other combinations of dhcp and dhcp6 and without additional parameters.
2. dhcp for ipv4 and static ip for ipv6. This is not how this is meant to work, I suppose. I get the ipv6 address and no ipv4 besides 127.0.0.1 on the lo interface.
3. Static ips for everything, for example ip=192.168.9.44::192.168.9.33:255.255.255.224:worker4.example.com:ens192:off:192.168.1.11:192.168.1.12 ip=[fd3f:58c7:4:D:192:168:9:44]::[fd3f:58c7:4:D::1]:64:worker4.example.com:ens192:off ip=10.0.8.44:::24::ens224:off: This seems to be doing what it is supposed to but results in a different problem. The search domain cannot be set via dracut cmdline (https://bugzilla.redhat.com/show_bug.cgi?id=1963882) and is set to . (a single dot) which is then somehow reduced to nothing in the name-resolver pods deployed by dns operator. A resolv.conf with an empty search domain makes dig unhappy and the internal registry unresolvable.
4. Static ips like in 3 but set the same ips and the search domain via ignition. This seems to work but creates redundancy.

ATM I have a running cluster that was installed with method 4 including dual stack hosts and ips for the storage network.
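
For reference, a minimal sketch of how the static `ip=` arguments quoted above are assembled. The field order is inferred from the examples in this comment (client IP, empty peer, gateway, netmask or prefix, hostname, interface, `off`, then up to two DNS servers); the helper name `dracut_ip_arg` is hypothetical. Note that none of the fields can carry a search domain, which is the gap described above.

```python
def dracut_ip_arg(address, gateway="", netmask="", hostname="", interface="", dns=()):
    """Build one static dracut-style ip= argument (hypothetical helper).

    Field order, as seen in the examples in this thread:
    client-ip : peer : gateway : netmask : hostname : interface : autoconf [: dns1 [: dns2]]
    """
    def wrap(addr):
        # IPv6 literals are bracketed so their colons are not read as field separators.
        return f"[{addr}]" if ":" in addr else addr

    fields = [
        wrap(address),
        "",                                # peer, left empty in every example above
        wrap(gateway) if gateway else "",
        netmask,
        hostname,
        interface,
        "off",                             # static configuration, no autoconf
    ]
    # Only the first two DNS servers are appended, matching the examples.
    fields.extend(wrap(server) for server in list(dns)[:2])
    return "ip=" + ":".join(fields)


if __name__ == "__main__":
    print(dracut_ip_arg("192.168.9.44", "192.168.9.33", "255.255.255.224",
                        "worker4.example.com", "ens192",
                        dns=("192.168.1.11", "192.168.1.12")))
    print(dracut_ip_arg("10.0.8.44", netmask="24", interface="ens224"))
```

The two calls reproduce the first and third arguments from the static example in the list.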

vrutkovs · 2021-06-02T16:51:00Z

> Static ips like in 3 but set the same ips and the search domain via ignition. This seems to work but creates redundancy.

Right, that's the only feasible option at the moment. If initially the interface can be configured via DHCP, then the whole static IP configuration (along with search domain) can be set in NM keyfiles.

This however won't work on all setups - but we depend on search domain configuration in dracut to close the feature gap

kai-uwe-rommel · 2021-06-04T11:34:14Z

I have already installed a number of clusters with static IPs. At the beginning (before Afterburn/dracut became available in FCOS), I modified the boot images (initial kernel arguments) for the initial boot and via ignition set the config via NM config file for ens192. When Afterburn was available I first only used it for the initial boot IP config and stayed with NM config files in addition to this.

After a number of tests I found that Afterburn config worked so well that I decided to use only Afterburn IP config and no longer set NM config files. I did also notice that I then have no way to specify the search suffix any longer. But I have not found yet that this causes problems. @gudroot, are you sure this causes problems with the internal registry? Perhaps we do not use this enough to even notice? How would I notice this?

@vrutkovs, with some recent OKD release there was also the change that after cluster deployment, the IP config is no longer on the ens192 interface but on br-ex. The Afterburn IP config ends up in a NM config file named default_connection.nmconnection so I would have to somehow modify this to add dns-search or already pre-create this via ignition, I assume.

vrutkovs · 2021-06-04T12:37:46Z

> with some recent OKD release

We didn't do OKD 4.8 releases yet, let's keep this ticket on topic - and file a new one if the change is needed

kai-uwe-rommel · 2021-06-04T16:41:37Z

@vrutkovs I was just meaning that what @gudroot filed here is probably not new/specific with 4.8.

kai-uwe-rommel · 2021-06-06T10:34:40Z

I can confirm that the problem that @gudroot describes about the internal registry not being resolvable when "dns-search=." is not specific to 4.8 ... It already happens at least with 4.7-2021-06-04 as well. I am pretty sure it did not happen with earlier releases. But I will verify this with another deployment/test cycle later today or tomorrow. Once I have more details I will create another issue and reference this one.

gudroot · 2021-06-07T07:30:45Z

@kai-uwe-rommel you seem to have found out already: to show the search domain problem simply run an image from the local registry like oc run mariadb --image=image-registry.openshift-image-registry.svc:5000/openshift/mariadb:latest. The pod will never come up because 'image-registry.openshift-image-registry.svc' can't be found.

kai-uwe-rommel · 2021-06-07T07:33:49Z

And I have meanwhile found out, that the problem already appears with OKD 4.7 2021-06-04 as well, not only with 4.8.
The problem does not appear with OKD 4.7 2021-05-22 and older versions.
I guess that the FCOS stream has been updated and that a newer NetworkManager is causing the problem.
When no search domain is specified with dracut/Afterburn (since there is no support for this), then in the automatically generated NetworkManager config files, in the [IPv4] section there is only "dns-search=", i.e. empty.
With the new version of OKD/FCOS/NetworkManager, this leads to "search=." in resolv.conf.
With the previous versions, there is no "search=..." line in resolv.conf at all.
@vrutkovs, do you still think I should open a new issue for this?

vrutkovs · 2021-06-07T07:38:39Z

> With the new version of OKD/FCOS/NetworkManager, this leads to "search=." in resolv.conf.

What was the behaviour in previous versions? I don't think empty DNS domain is ever a correct situation.

Edit: I updated to 2021-06-04 and NM didn't append search record here, so perhaps it affects new installs only?

kai-uwe-rommel · 2021-06-07T07:55:04Z

We are not talking about the DNS domain. When you specify a FQDN via dracut/Afterburn as the hostname, then the domain part of it is used as the DNS domain. The problem is the search domain (or search suffix list).

kai-uwe-rommel · 2021-06-07T08:07:07Z

As a workaround I re-enabled the code in my deployment scripts that also creates NetworkManager config files via ignition in addition to the Afterburn vSphere config string. This for now solves the problem and I again have a working search domain suffix. I create a default.nmconnection referring to ens192 and the OKD setup picks up the data from it correctly for the br-ex config file it generates.

(I had stopped generating NetworkManager config files as dracut/Afterburn worked so well. And a search suffix was not a problem so far - but now with the value of "." it is a problem).
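
A minimal sketch of the workaround described above: generating a NetworkManager keyfile that carries an explicit search domain. The key names (`id`, `type`, `interface-name`, `method`, `address1`, `dns`, `dns-search`) follow the NetworkManager keyfile format as I understand it and should be checked against the nm-settings-keyfile documentation; the function name `render_keyfile` and the sample addresses are assumptions taken from earlier in this thread, not from the actual deployment scripts.

```python
import configparser
import io


def render_keyfile(interface, address, gateway, dns_servers, search_domains):
    """Render a static IPv4 NetworkManager keyfile as text (illustrative only)."""
    if not search_domains:
        # An empty dns-search is exactly the situation this thread is about.
        raise ValueError("at least one search domain is required")

    parser = configparser.ConfigParser(interpolation=None)
    parser["connection"] = {
        "id": interface,
        "type": "ethernet",
        "interface-name": interface,
    }
    parser["ipv4"] = {
        "method": "manual",
        "address1": f"{address},{gateway}",        # "<ip>/<prefix>,<gateway>"
        "dns": ";".join(dns_servers) + ";",         # list values end with ';'
        "dns-search": ";".join(search_domains) + ";",
    }

    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render_keyfile("ens192", "192.168.9.44/27", "192.168.9.33",
                         ["192.168.1.11", "192.168.1.12"], ["example.com"]))
```

The resulting text would still have to be delivered to the node, for example as a file written by ignition as described above.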

vrutkovs · 2021-06-07T08:29:55Z

My bad, I meant search domain. I don't think it ever got set previously - and I don't see why would that affect the deployment - cluster.local would be appended to search domain list anyway. If nodes have FQDN hostnames (as they must) empty (or .) search domain should be equivalent and play no role in address resolution

kai-uwe-rommel · 2021-06-07T08:52:45Z

Well that's the theory. :-) In practice, we see this difference now.
The cluster deployment is not affected.
But once the cluster is up and you try to deploy something that refers to an image in the cluster's internal registry, it fails.
Example: I always deploy a cron job to each cluster for LDAP group synchronization.
It uses this image:

Containers:
  ldap-group-sync:
    Image:         image-registry.openshift-image-registry.svc:5000/openshift/cli

With the new release now that fails:

Events:
  Type     Reason          Age                 From               Message
  ----     ------          ----                ----               -------
  Normal   Scheduled       73m                 default-scheduler  Successfully assigned openshift-authentication/custom-ldap-group-sync-1622970000-g2cwh to worker-02.kur-test.ars.de
  Normal   AddedInterface  73m                 multus             Add eth0 [10.131.0.122/23]
  Normal   Pulling         71m (x4 over 73m)   kubelet            Pulling image "image-registry.openshift-image-registry.svc:5000/openshift/cli"
  Warning  Failed          71m (x4 over 73m)   kubelet            Failed to pull image "image-registry.openshift-image-registry.svc:5000/openshift/cli": rpc error: code = Unknown desc = error pinging docker registry image-registry.openshift-image-registry.svc:5000: Get "https://image-registry.openshift-image-registry.svc:5000/v2/": dial tcp: lookup image-registry.openshift-image-registry.svc on 193.149.36.118:53: no such host
  Warning  Failed          71m (x4 over 73m)   kubelet            Error: ErrImagePull
  Warning  Failed          63m (x42 over 73m)  kubelet            Error: ImagePullBackOff
  Normal   BackOff         58m (x63 over 73m)  kubelet            Back-off pulling image "image-registry.openshift-image-registry.svc:5000/openshift/cli"

With all previous releases this was working fine.
This is the same that @gudroot reported above.
The problem goes away as soon as I add a sensible dns-search=... to the NetworkManager config file(s) and reboot the nodes.

gudroot · 2021-06-07T10:06:14Z

in my opinion the problem is in the node-resolver pods. Since I don't have the broken cluster any more, I'll have to recall this from memory:
Not setting the search domain results in a search domain of "." and a resolv.conf with the content

search .
nameserver <ip1>
nameserver <ip2>

on the host.
This is then somehow mangled and the node-resolver pods get a resolv.conf that is slightly different:

search
nameserver <ip1>
nameserver <ip2>

I am not sure a search domain "." offers any advantages over not having a search domain at all, but having an empty search domain is IMHO syntactically incorrect. It reminds me of this old bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=669163 Maybe dig and nslookup are not the only tools that stumble upon that kind of error in resolv.conf.
For the exact error messages I would have to reinstall the cluster.
@kai-uwe-rommel do you still have your broken cluster?
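
A small sketch for spotting the two variants described above in a resolv.conf: a search line that contains only the root domain (as on the host) and a search line with no domains at all (as in the node-resolver pod). The function name `search_line_problems` and the message strings are hypothetical.

```python
def search_line_problems(resolv_conf_text):
    """Return (line_number, description) for each suspicious search line."""
    problems = []
    for number, line in enumerate(resolv_conf_text.splitlines(), start=1):
        tokens = line.split()
        if not tokens or tokens[0] != "search":
            continue  # comments, nameserver lines, blank lines
        domains = tokens[1:]
        if not domains:
            problems.append((number, "empty search list"))
        elif domains == ["."]:
            problems.append((number, "search list is only the root domain"))
    return problems


if __name__ == "__main__":
    host = "search .\nnameserver 192.168.51.1\n"
    pod = "search\nnameserver 192.168.51.1\n"
    print(search_line_problems(host))
    print(search_line_problems(pod))
```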

kai-uwe-rommel · 2021-06-07T10:37:52Z

No, I don't have the "broken" cluster available at the moment because I already tested/implemented the workaround.
But I can deploy a new "broken" cluster easily with about 10 minutes of work, if needed.

vrutkovs · 2021-06-20T07:24:03Z

Perhaps it's similar to #690, could you check if this still happens on OKD 4.8 RC3?

kai-uwe-rommel · 2021-06-20T19:20:28Z

@vrutkovs, I think these are two different problems. The #690 issue talks about a missing /etc/systemd/resolved.conf.d directory. This case here is (also) about the "search ." problem.

I don't have time to test OKD 4.8 RC versions at the moment but I do always test new 4.7 releases as I need them for my projects. So I just installed a 4.7 2021-06-19 cluster and still see the "search ." problem that this issue here is (also) about. But the /etc/systemd/resolved.conf.d directory does exist (#690 also is about 4.7).

So the primary topic of #690 is not our problem here. But in #690 @bobby0724 and @chrisu001 were later hijacking the issue a bit also for the "search ." problem which is apparently unrelated to @fortinj66's /etc/systemd/resolved.conf.d problem.

vrutkovs · 2021-06-20T21:15:03Z

I was hoping working resolved.conf configuration would resolve (or at least alleviate) this issue, but alas.

Looks like there isn't anything to fix in OKD right now. It appears the preferred ways to configure this are:

via kernel args - blocked by dracut feature request implementation
via NM config file, placed by MC on every node - not sure if NM allows dropins for connection files.
Perhaps we could implement this via a service which runs nmcli early in the boot?

chrisu001 · 2021-06-21T05:14:12Z

I can confirm, that the bug search . still exists in

● pivot://registry.ci.openshift.org/origin/4.8-2021-06-19-221544@sha256:3e274ce09a9d5cb5a1fadd3927f7f068e7d3b75cf362b16745f7706e52bd7dcb
              CustomOrigin: Managed by machine-config-operator
                   Version: 48.34.202106191727-0 (2021-06-19T17:31:30Z)

  ostree://fedora:fedora/x86_64/coreos/stable
                   Version: 34.20210529.3.0 (2021-06-14T14:45:28Z)
                    Commit: 5040eaabed46962a07b1e918ba5afa1502e1f898bf958673519cd83e986c228f
              GPGSignature: Valid signature by 8C5BA6990BDB26E19F2A1A801161AE6945719A39

node resolv.conf:

search .
nameserver 192.168.51.1

node-resolver pod

$ oc exec  -n openshift-dns node-resolver-55gtc -- cat /etc/resolv.conf
search
nameserver 192.168.51.1

so the node-resolver pod still fails

$ oc logs  -n openshift-dns node-resolver-55gtc
dig: parse of /etc/resolv.conf failed
...

Hint: this is an automated test-setup based on UPI Virtualboxes with static ip/kernelargs
setting a search domain will circumvent the issue

gudroot · 2021-06-21T07:21:44Z

> I was hoping working resolved.conf configuration would resolve (or at least alleviate) this issue, but alas.

Are we talking about the search domain issue or the hostname issue now? IMO the hostname issue affects every dual stack cluster, since you cannot use both dhcp4 and dhcp6 at the same time (I tried it and only got IPv4), but you can't use static ips via afterburn either as this would trigger the search domain issue.

Last time I checked dual stack was on the roadmap for 4.8, so if it still is, there should be some kind of idea on how to set this up correctly.

> Looks like there isn't anything to fix in OKD right now. It appears the preferred ways to configure this are:
>
> via kernel args - blocked by dracut feature request implementation
>
> via NM config file, placed by MC on every node - not sure if NM allows dropins for connection files.
> Perhaps we could implement this via a service which runs nmcli early in the boot?

This would be a temporary fix for the search domain issue until dracut is capable of configuring a search domain?

vrutkovs · 2021-06-21T09:54:33Z

#698 (comment) - perhaps it's SELinux preventing the hostname from being set?

kai-uwe-rommel · 2021-06-21T11:20:07Z

Currently I'm already solving the "search=..." problem by adding a small NetworkManager file during ignition (to keep things together, static IP config also happens in this stage).
I do not have a problem with that but it was quite a learning curve to determine all this and create my own meta installer for UPI deployment on vSphere with static IPs. At least some documentation would be helpful for others that will come the same way later.

gudroot · 2021-06-21T15:33:33Z

The search=dot problem seems to be more common than I thought: #694

gudroot · 2021-06-23T08:11:33Z

> #698 (comment) - perhaps it's SELinux preventing the hostname from being set?

I can see the same selinux messages in the logs captured on a cluster that is affected by the hostname=localhost problem that was the subject of this issue before we started to discuss overly short search domains.

I'll try another installation when selinux-policy 34.11-1.fc34 hits 4.8

fortinj66 · 2021-06-23T10:19:30Z

@gudroot, how are you assigning hostnames? via DHCP or reverse DNS lookup?

edit: Never mind, I see it is by DHCP...

kai-uwe-rommel · 2021-06-23T10:24:02Z

@fortinj66 you did ask gudroot and not me but I would still like to add my $0.02 ...
I have learned that if I want reliable clusters in all configurations (e.g. DHCP or fixed IPs) I better explicitly set the hostname via a statement in the ignition file to configure the /etc/hostname file.
I'm always using UPI on vSphere, though. This gives a lot more freedom.
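
A minimal sketch of such an ignition statement, generated from Python. The Ignition spec version `3.2.0` is an assumption and has to match what the installed release accepts; the helper name `hostname_ignition` and the sample hostname are hypothetical.

```python
import json
from urllib.parse import quote


def hostname_ignition(hostname, spec_version="3.2.0"):
    """Return an Ignition config (as JSON text) that writes /etc/hostname."""
    config = {
        "ignition": {"version": spec_version},  # assumed spec version
        "storage": {
            "files": [
                {
                    "path": "/etc/hostname",
                    "mode": 0o644,               # serialized as decimal 420
                    "overwrite": True,
                    # File body as a percent-encoded data URL, with trailing newline.
                    "contents": {"source": "data:," + quote(hostname + "\n")},
                }
            ]
        },
    }
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    print(hostname_ignition("worker4.example.com"))
```

The output is meant to be merged into the per-node ignition file, so each node needs its own copy with its own hostname.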

fortinj66 · 2021-06-23T10:29:11Z

> @fortinj66 you did ask gudroot and not me but I would still like to add my $0.02 ...
> I have learned that if I want reliable clusters in all configurations (e.g. DHCP or fixed IPs) I better explicitly set the hostname via a statement in the ignition file to configure the /etc/hostname file.
> I'm always using UPI on vSphere, though. This gives a lot more freedom.

But you shouldn't have to... FCOS should be able to resolve the hostnames either by DHCP or reverse lookup.

reverse lookup seems to be completely broken in FCOS 34. I'm going to test DHCP assignment later this morning.

kai-uwe-rommel · 2021-06-23T10:42:48Z

In theory, many things ought to work much better than they actually do.
Not always do I spend much time to insist on this and work with the vendor to fix things.
In some cases I simply find a good and easy solution for myself and spend my time on more important things...

vrutkovs · 2021-06-25T10:45:17Z

> Are we going to have to go back to FCOS 33?

No, luckily NM team has COPR, so we can use 1.26 or 1.32 builds - or try 1.30.2 from fedora stable repo (latest from updates is .4). See PRs to implement these above.

I mirrored CI test releases to

quay.io/vrutkovs/okd-release:4.7-nm-1.32
quay.io/vrutkovs/okd-release:4.7-nm-1.30.2
quay.io/vrutkovs/okd-release:4.7-nm-1.26

Please give these a try.

fortinj66 · 2021-06-25T19:18:05Z

I've tested all three versions...

search . is still broken for static IPs on all three versions
reverse lookup for hostname works on 1.26 and 1.32. Broken on 1.30.2

fortinj66 · 2021-06-25T21:13:51Z

So, it seems the issue with search domains with static IPs in FCOS 34 is systemd-resolved related:

static int write_uplink_resolv_conf_contents(FILE *f, OrderedSet *dns, OrderedSet *domains) {

        fputs("# This is "PRIVATE_UPLINK_RESOLV_CONF" managed by man:systemd-resolved(8).\n"
              "# Do not edit.\n"
              "#\n"
              "# This file might be symlinked as /etc/resolv.conf. If you're looking at\n"
              "# /etc/resolv.conf and seeing this text, you have followed the symlink.\n"
              "#\n"
              "# This is a dynamic resolv.conf file for connecting local clients directly to\n"
              "# all known uplink DNS servers. This file lists all configured search domains.\n"
              "#\n"
              "# Third party programs should typically not access this file directly, but only\n"
              "# through the symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a\n"
              "# different way, replace this symlink by a static file or a different symlink.\n"
              "#\n"
              "# See man:systemd-resolved.service(8) for details about the supported modes of\n"
              "# operation for /etc/resolv.conf.\n"
              "\n", f);

        if (ordered_set_isempty(dns))
                fputs("# No DNS servers known.\n", f);
        else {
                unsigned count = 0;
                DnsServer *s;

                ORDERED_SET_FOREACH(s, dns)
                        write_resolv_conf_server(s, f, &count);
        }

        if (ordered_set_isempty(domains))
                fputs("search .\n", f); /* Make sure that if the local hostname is chosen as fqdn this does not
                                         * imply a search domain */
        else
                write_resolv_conf_search(domains, f);

        return fflush_and_check(f);
}

This line:
fputs("search .\n", f); /* Make sure that if the local hostname is chosen as fqdn this does not
                         * imply a search domain */

fortinj66 · 2021-06-25T21:55:22Z

The equivalent in FCOS 33:

        if (!ordered_set_isempty(domains))
                write_resolv_conf_search(domains, f);
        return fflush_and_check(f);

No search line is written if there are no domains
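
The difference between the two excerpts can be condensed into a small model. This is a simplified illustration of the branch logic only, not systemd code: the function name `search_lines` and the flag `write_root_when_empty` are hypothetical, and whatever `write_resolv_conf_search` does beyond listing the domains is left out.

```python
def search_lines(domains, write_root_when_empty):
    """Model which search line ends up in the generated resolv.conf.

    write_root_when_empty=True  mirrors the newer excerpt (FCOS 34),
    write_root_when_empty=False mirrors the older excerpt (FCOS 33).
    """
    if domains:
        return ["search " + " ".join(domains)]
    # No search domains configured: the two versions diverge here.
    return ["search ."] if write_root_when_empty else []


if __name__ == "__main__":
    print(search_lines([], write_root_when_empty=True))
    print(search_lines([], write_root_when_empty=False))
    print(search_lines(["example.com"], write_root_when_empty=True))
```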

kai-uwe-rommel · 2021-06-26T06:00:03Z

Actually, I'd even regard it as smart if in the case there is no search domain specified explicitly, it would use the domain suffix from the hostname if that was specified as a FQDN ...

vrutkovs · 2021-06-26T09:23:07Z

Seems it was implemented by systemd/systemd#17201, so we might want to revert systemd-* to v246

vrutkovs · 2021-06-26T09:27:49Z

Also, this bug is like a million comments long. Could anyone summarize which problems we're hitting, what the workarounds are, and which packages need to be updated/downgraded?

IIUC it's two bugs:

reverse DNS hostname is broken in NM 1.30+, fixed in 1.32 (we can fasttrack it)
systemd-resolved no longer allows containers to use the image-registry.openshift-image-registry.svc:5000 image. This requires reverting systemd to v246, which is very risky.
Do we have a workaround for that? Does .svc.cluster.local work?

fortinj66 · 2021-06-26T12:21:21Z

> Do we have a workaround for that? Does .svc.cluster.local work?

1. We could work around this in the prepender code... Just add a line which removes "search ." The issue would be that if systemd-resolved was run again it would recreate the problem
2. A service that gets triggered when systemd-resolved is run...

edit: actually, prepender won't work as it doesn't exist for UPI installs....

fortinj66 · 2021-06-27T14:30:44Z

> 2. A service that gets triggered when systemd-resolved is run...

openshift/okd-machine-os#156

LorbusChris · 2021-06-28T03:18:21Z

Someone will have to file this upstream with FCOS and resolved. Any fixup commits will have to contain a proper justification and links to the filed issues.

fortinj66 · 2021-06-28T10:30:10Z

@kai-uwe-rommel @bobby0724 @gudroot

would you folks be able to test: quay.io/fortinj66/origin-release:v4.7-search-fix

It's based on the quay.io/openshift/okd:4.7.0-0.okd-2021-06-19-191547 stable release...

It has a workaround for the search . issue.

fortinj66 · 2021-06-28T10:31:31Z

> Someone will have to file this upstream with FCOS and resolved. Any fixup commits will have to contain a proper justification and links to the filed issues.

I want to make sure the 'fix' works for folks other than me before I go through the rest of the hoops...

kai-uwe-rommel · 2021-06-28T10:39:41Z

Will do tonight.

kai-uwe-rommel · 2021-06-28T20:22:23Z

Seems to be fixed. No search line in resolv.conf at all (when configuring static IP purely via Afterburn, i.e. no search domain specified).

fortinj66 · 2021-06-28T21:46:03Z

> Seems to be fixed. No search line in resolv.conf at all (when configuring static IP purely via Afterburn, i.e. no search domain specified).

Were you able to deploy new Deployments and pods? I was able to with my testing...

kai-uwe-rommel · 2021-06-29T06:12:01Z

Yes, looks all normal.

fortinj66 · 2021-06-29T17:19:27Z

fixes for search . and hostname via reverse DNS should be in the latest nightlies starting with nightly registry.ci.openshift.org/origin/release:4.7.0-0.okd-2021-06-29-103107

bobby0724 · 2021-06-30T02:02:41Z

thanks for fixing this issue

vrutkovs · 2021-07-03T21:48:53Z

Should be fixed in https://amd64.origin.releases.ci.openshift.org/releasestream/4-stable/release/4.7.0-0.okd-2021-07-03-190901

johnlongo · 2021-07-16T21:55:54Z

I'm running 4.7.0-0.okd-2021-07-03-190901, but it is still broken in my cluster

johnlongo · 2021-07-19T11:34:21Z

I followed the above conversation and wanted to know whether there is a workaround for this issue. What should resolv.conf look like to work?

vrutkovs · 2021-07-19T11:40:52Z

> What should resolv.conf look like to work?

Latest OKD now runs a service which removes search . from /etc/resolv.conf. Check that the file on the nodes doesn't contain it.

If you're hitting an issue with similar symptoms and search . is not present, please file a new bug
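
To illustrate the effect of such a cleanup, here is a minimal sketch that drops a root-only search line from resolv.conf text. It is not the service shipped with OKD (see openshift/okd-machine-os#156 for that); the function name `drop_root_only_search` is hypothetical and the sketch only shows what the file is expected to look like afterwards.

```python
def drop_root_only_search(resolv_conf_text):
    """Remove lines that consist solely of 'search .'; keep everything else."""
    kept = [
        line
        for line in resolv_conf_text.splitlines()
        if line.split() != ["search", "."]
    ]
    # Re-terminate the file with a newline unless nothing is left.
    return "\n".join(kept) + ("\n" if kept else "")


if __name__ == "__main__":
    before = "search .\nnameserver 192.168.51.1\n"
    print(drop_root_only_search(before), end="")
```

A search line that lists real domains is left untouched, so nodes with a configured search domain are not affected.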

PyrekP · 2021-10-28T09:40:00Z

Hi, just installed 4.8.0-0.okd-2021-10-24-061736 with static IPs and still have search . there. So same as:

> Static ips for everything like in for example
> ip=192.168.9.44::192.168.9.33:255.255.255.224:worker4.example.com:ens192:off:192.168.1.11:192.168.1.12 ip=[fd3f:58c7:4:D:192:168:9:44]::[fd3f:58c7:4:D::1]:64:worker4.example.com:ens192:off ip=10.0.8.44:::24::ens224:off: This seems to be doing what it is supposed to but results in a different problem. The search domain cannot be set via dracut cmdline (https://bugzilla.redhat.com/show_bug.cgi?id=1963882) and is set to . (a single dot) which is then somehow reduced to nothing in the name-resolver pods deployed by dns operator. A resolv.conf with an empty search domain makes dig unhappy and the internal registry unresolvable.

How can I get rid of it?

kai-uwe-rommel · 2021-11-04T14:57:26Z

> Hi, just installed 4.8.0-0.okd-2021-10-24-061736 with static IPs and still have search . there. So same as:
>
> Static ips for everything like in for example
> ip=192.168.9.44::192.168.9.33:255.255.255.224:worker4.example.com:ens192:off:192.168.1.11:192.168.1.12 ip=[fd3f:58c7:4:D:192:168:9:44]::[fd3f:58c7:4:D::1]:64:worker4.example.com:ens192:off ip=10.0.8.44:::24::ens224:off: This seems to be doing what it is supposed to but results in a different problem. The search domain cannot be set via dracut cmdline (https://bugzilla.redhat.com/show_bug.cgi?id=1963882) and is set to . (a single dot) which is then somehow reduced to nothing in the name-resolver pods deployed by dns operator. A resolv.conf with an empty search domain makes dig unhappy and the internal registry unresolvable.
>
> How can I get rid of it?

I cannot reproduce this. I just installed a cluster with the same version and also static IPs and do not see this problem.

This was referenced Jun 25, 2021:

WIP [release-4.7] Revert to NetworkManager 1.26 openshift/okd-machine-os#151 (Closed)
[release-4.7] manifest: fasttrack NetworkManager 1.32 openshift/okd-machine-os#152 (Merged)
WIP [release-4.7] manifest: revert to NM 1.30.2 openshift/okd-machine-os#153 (Closed)

vrutkovs closed this as completed Jul 3, 2021

vrutkovs mentioned this issue Jul 15, 2021: Just in stalled a fresh 4.7 cluster on vsphere 6.7 but have an odd problem #768 (Closed)

kai-uwe-rommel mentioned this issue Apr 15, 2022: Cannot DNS-resolve the internal cluster registry #1184 (Closed)
