Proxy issue during 4.12 upgrade #1481

llomgui · 2023-01-25T08:54:13Z

Hello,

During a 4.12 upgrade I had an issue with the worker upgrade (Fedora 36 to Fedora 37).

The first worker was stuck, so I checked journalctl -f.
I saw the following log: Txn Rebase on /org/projectatomic/rpmostree1/fedora_coreos failed: Failed to invoke skopeo proxy method OpenImage: remote error: pinging container registry quay.io: Get "https://quay.io/v2/": dial tcp 54.163.152.191:443: i/o timeout

The cluster is behind a company proxy, so it should not try to reach quay.io directly.

On some workers, the solution was to create this file:

sudo vi /etc/systemd/system/rpm-ostreed.service.d/http-proxy.conf

[Service]
Environment="http_proxy=PROXY_URL"

sudo systemctl daemon-reload
sudo systemctl restart rpm-ostreed.service

But on some others, I didn't have to do anything. It worked without any issue.

Before updating this cluster, I created a PoC cluster with the same version, 4.11.0-0.okd-2023-01-14-152430, to make sure the upgrade to 4.12 works on GCP. I didn't hit any issue there.


vrutkovs · 2023-01-25T09:39:43Z

> But on some others, I didn't have to do anything. It worked without any issue.

Did the other nodes get these configuration settings after a successful reboot? I'd expect MCO to apply proxy settings, but it looks like there's a race in applying them.

llomgui · 2023-01-25T09:43:26Z

> Did the other nodes get these configuration settings after a successful reboot? I'd expect MCO to apply proxy settings, but it looks like there's a race in applying them.

No, on the "without issue" workers, I don't have this file after a successful reboot.

tyronewilsonfh · 2023-01-25T10:42:15Z

I experienced this issue on 5 nodes in a 10-node test cluster (same MCP). Running the rebase command manually with both upper- and lowercase http/https proxy environment variables set would sometimes show a "pulling manifest" message and then time out with the above error. Most attempts would just time out with the same message.

Upgrading another cluster in the same environment didn't have these issues.

vrutkovs · 2023-01-25T10:47:09Z

Sounds indeed like an MCO race. Please report this to https://issues.redhat.com/browse/OCPBUGS, component "Machine Config Operator", and attach a must-gather.

vinisman · 2023-03-29T05:47:55Z

We also run OKD behind a proxy, and this approach helped us update to version 4.12. @llomgui thank you.
We had this kind of error:
E0329 05:46:02.424632 1461945 writer.go:200] Marking Degraded due to: failed to update OS to quay.io/openshift/okd-content@sha256:125e94f63520330aa50e85bcfd55e429d3051498c9ff3936628edd0d4ea5696b : error running rpm-ostree rebase --experimental ostree-unverified-registry:quay.io/openshift/okd-content@sha256:125e94f63520330aa50e85bcfd55e429d3051498c9ff3936628edd0d4ea5696b: error: remote error: (Mirrors also failed: [nexus.dev.mycompany.com:60002/okd-mirror/okd@sha256:125e94f63520330aa50e85bcfd55e429d3051498c9ff3936628edd0d4ea5696b: reading manifest sha256:125e94f63520330aa50e85bcfd55e429d3051498c9ff3936628edd0d4ea5696b in nexus.dev.mycompany.com:60002/okd-mirror/okd: manifest unknown: manifest unknown] 715 [nexus.dev.mycompany.com:60022/okd-mirror/okd@sha256:125e94f63520330aa50e85bcfd55e429d3051498c9ff3936628edd0d4ea5696b: reading manifest sha256:125e94f63520330aa50e85bcfd55e429d3051498c9ff3936628edd0d4ea5696b in nexus.dev.mycompany.com:60022/okd-mirror/okd: manifest unknown: manifest unknown]): quay.io/openshift/okd-content@sha256:125e94f63520330aa50e85bcfd55e429d3051498c9ff3936628edd0d4ea5696b: pinging container registry quay.io: Get "https://quay.io/v2/": dial tcp: lookup quay.io: no such host 716 : exit status 1

danielchristianschroeter · 2023-05-24T06:34:38Z

I ran into the same error situation when upgrading from 4.11.0-0.okd-2023-01-14-152430 to 4.12.0-0.okd-2023-04-16-041331.
In journalctl -b -u rpm-ostreed you see errors like this:
Txn Rebase on /org/projectatomic/rpmostree1/fedora_coreos failed: Failed to invoke skopeo proxy method OpenImage: remote error: pinging container registry quay.io: Get "https://quay.io/v2/": dial tcp 3.87.166.194:443: i/o timeout

You also see crashing coredns- and keepalived- pods on the affected node. I executed this one-liner via SSH on all nodes:
echo -e "[Service]\nEnvironment=\"https_proxy=http://<proxy>:3128\"" | sudo tee /etc/systemd/system/rpm-ostreed.service.d/http-proxy.conf >/dev/null && sudo systemctl daemon-reload && sudo systemctl restart rpm-ostreed.service

Please note that you can only create http-proxy.conf during an upgrade process; otherwise the directory /etc/systemd/system/rpm-ostreed.service.d/ does not exist on the node. I'm not sure whether you can also place this file before starting the upgrade.
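
The missing-directory limitation might be avoided by creating the directory before writing the drop-in. The following is a minimal sketch under that assumption, not something confirmed in this thread: the function name write_proxy_dropin, the optional root prefix (used only to try it out in a scratch directory), and the proxy URL in the usage line are all hypothetical. It writes both http_proxy and https_proxy, combining the two variants posted above.

```shell
#!/bin/bash
# Sketch: write a proxy drop-in for rpm-ostreed.service, creating the
# drop-in directory first if it does not exist yet.
# $1 = proxy URL (placeholder), $2 = optional root prefix for dry runs.
write_proxy_dropin() {
    local proxy_url="$1"
    local root="${2:-}"   # empty means the real filesystem (needs root)
    local dir="${root}/etc/systemd/system/rpm-ostreed.service.d"

    mkdir -p "$dir" || return 1
    printf '[Service]\nEnvironment="http_proxy=%s"\nEnvironment="https_proxy=%s"\n' \
        "$proxy_url" "$proxy_url" > "${dir}/http-proxy.conf"
}
```

On a node this would be run as root, for example write_proxy_dropin "http://<proxy>:3128", followed by the same systemctl daemon-reload and systemctl restart rpm-ostreed.service steps as in the one-liner. Whether a drop-in placed before the upgrade starts is kept and honoured has not been verified here.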

vrutkovs · 2024-02-23T07:48:43Z

Caused by ostreedev/ostree-rs-ext#582, fixed in rpm-ostree 2024.2. openshift/okd-machine-os#751 should include it, but in order to update to it in disconnected env you'd need a workaround (see previous comment)

llomgui · 2024-02-25T09:00:37Z

The workaround from the earlier comment won't work with the latest 4.15 version.
The only way is to create an executable /usr/local/bin/skopeo with the following content:

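#!/bin/bash
# Wrapper around the real skopeo: set the proxy variables only when
# invoked from rpm-ostreed.service, then hand over to /usr/bin/skopeo.

if [ "$(systemctl whoami)" = "rpm-ostreed.service" ]; then
	export http_proxy=http://MY_PROXY:3128
	export https_proxy=http://MY_PROXY:3128
fi

exec /usr/bin/skopeo "$@"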

Credit to ostreedev/ostree-rs-ext#582 (comment)

vrutkovs · 2024-02-25T17:42:55Z

@llomgui @danielchristianschroeter could you check if https://github.com/okd-project/okd/releases/tag/4.15.0-0.okd-2024-02-23-163410 sets correct proxy vars for rpm-ostreed?
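
One way to check this on a node is to look at the environment of the running rpm-ostreed process rather than at the unit files, since that shows the variables regardless of how they were set. The sketch below is an illustration only: both function names are hypothetical, and it assumes the daemon is currently running (it may have exited while idle, in which case there is no process to inspect).

```shell
#!/bin/bash
# Print proxy-related variables from a NUL-separated environment dump on
# stdin (the format of /proc/<pid>/environ). Non-zero exit if none found.
proxy_vars_from_environ() {
    tr '\0' '\n' | grep -Ei '^(https?|no)_proxy='
}

# Hypothetical helper: inspect the running rpm-ostreed process (needs root).
show_rpm_ostreed_proxy() {
    local pid
    pid="$(systemctl show -p MainPID --value rpm-ostreed.service)"
    if [ -z "$pid" ] || [ "$pid" = "0" ]; then
        echo "rpm-ostreed.service is not running" >&2
        return 1
    fi
    proxy_vars_from_environ < "/proc/${pid}/environ"
}
```

Running sudo bash -c with these functions loaded and calling show_rpm_ostreed_proxy while an upgrade is in progress should list any http_proxy, https_proxy, or no_proxy values the daemon actually sees. Empty output with a non-zero exit status means none are set.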

llomgui · 2024-03-04T08:48:56Z

@vrutkovs It doesn't work, I still have to use the workaround above.
I will try with another cluster jumping directly from 2024-01-27-070424 to 2024-02-23-163410.

vrutkovs · 2024-03-04T09:12:27Z

Does it fail on a clean install or on an upgrade?

Vins88 · 2024-03-12T23:01:13Z

Hello,
I experienced the same problem when I upgraded from 4.15.0-0.okd-2024-02-10-035534 to 4.15.0-0.okd-2024-03-10-010116.
I can confirm that this procedure correctly applied the proxy env and allowed the upgrade to be completed:
ostreedev/ostree-rs-ext#582 (comment)

thanks for sharing

vrutkovs closed this as completed Mar 13, 2024
