kubeadm can't pull coredns:v1.8.6 from registry.k8s.io #2761

Closed
deric opened this issue Sep 23, 2022 · 9 comments
Labels
kind/support Categorizes issue or PR as a support question.

Comments

deric commented Sep 23, 2022

BUG REPORT

kubeadm can't upgrade v1.24.3 -> v1.24.6

$ kubeadm upgrade apply v1.24.6 
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade/version] You have chosen to change the cluster version to "v1.24.6"
[upgrade/versions] Cluster version: v1.24.3
[upgrade/versions] kubeadm version: v1.24.6
[upgrade/confirm] Are you sure you want to proceed with the upgrade? [y/N]: y
[upgrade/prepull] Pulling images required for setting up a Kubernetes cluster
[upgrade/prepull] This might take a minute or two, depending on the speed of your internet connection
[upgrade/prepull] You can also perform this action in beforehand using 'kubeadm config images pull'
[preflight] Some fatal errors occurred:
        [ERROR ImagePull]: failed to pull image registry.k8s.io/coredns:v1.8.6: output: E0923 12:55:18.964141 1518584 remote_image.go:242] "PullImage from image service failed" err="rpc error: code = NotFound desc = failed to pull and unpack image \"registry.k8s.io/coredns:v1.8.6\": failed to resolve reference \"registry.k8s.io/coredns:v1.8.6\": registry.k8s.io/coredns:v1.8.6: not found" image="registry.k8s.io/coredns:v1.8.6"
time="2022-09-23T12:55:18Z" level=fatal msg="pulling image: rpc error: code = NotFound desc = failed to pull and unpack image \"registry.k8s.io/coredns:v1.8.6\": failed to resolve reference \"registry.k8s.io/coredns:v1.8.6\": registry.k8s.io/coredns:v1.8.6: not found"
, error: exit status 1

Versions

kubeadm version (use kubeadm version):

Environment:

  • Kubernetes version (use kubectl version): v1.24.3
  • Cloud provider or hardware configuration: bare metal
  • OS (e.g. from /etc/os-release): Debian 11
  • Kernel (e.g. uname -a): 5.10.0-17
  • Container runtime (CRI) (e.g. containerd, cri-o): containerd
  • Container networking plugin (CNI) (e.g. Calico, Cilium): flannel

What happened?

Failed to pull image "registry.k8s.io/coredns:v1.8.6".

What you expected to happen?

Download the image or use local cache.

How to reproduce it (as minimally and precisely as possible)?

Pulling images works fine

$ kubeadm config images pull
I0923 12:56:08.485224 1518651 version.go:255] remote version is much newer: v1.25.2; falling back to: stable-1.24
[config/images] Pulled k8s.gcr.io/kube-apiserver:v1.24.6
[config/images] Pulled k8s.gcr.io/kube-controller-manager:v1.24.6
[config/images] Pulled k8s.gcr.io/kube-scheduler:v1.24.6
[config/images] Pulled k8s.gcr.io/kube-proxy:v1.24.6
[config/images] Pulled k8s.gcr.io/pause:3.7
[config/images] Pulled k8s.gcr.io/etcd:3.5.3-0
[config/images] Pulled k8s.gcr.io/coredns/coredns:v1.8.6

However, this pulls from k8s.gcr.io, while the actual upgrade uses registry.k8s.io.

When registry.k8s.io is specified, the coredns:v1.8.6 image is missing.

$ kubeadm config images pull --image-repository=registry.k8s.io
I0923 12:57:22.827585 1518788 version.go:255] remote version is much newer: v1.25.2; falling back to: stable-1.24
[config/images] Pulled registry.k8s.io/kube-apiserver:v1.24.6
[config/images] Pulled registry.k8s.io/kube-controller-manager:v1.24.6
[config/images] Pulled registry.k8s.io/kube-scheduler:v1.24.6
[config/images] Pulled registry.k8s.io/kube-proxy:v1.24.6
[config/images] Pulled registry.k8s.io/pause:3.7
[config/images] Pulled registry.k8s.io/etcd:3.5.3-0
failed to pull image "registry.k8s.io/coredns:v1.8.6": output: E0923 12:57:26.772228 1518852 remote_image.go:242] "PullImage from image service failed" err="rpc error: code = NotFound desc = failed to pull and unpack image \"registry.k8s.io/coredns:v1.8.6\": failed to resolve reference \"registry.k8s.io/coredns:v1.8.6\": registry.k8s.io/coredns:v1.8.6: not found" image="registry.k8s.io/coredns:v1.8.6"
time="2022-09-23T12:57:26Z" level=fatal msg="pulling image: rpc error: code = NotFound desc = failed to pull and unpack image \"registry.k8s.io/coredns:v1.8.6\": failed to resolve reference \"registry.k8s.io/coredns:v1.8.6\": registry.k8s.io/coredns:v1.8.6: not found"
, error: exit status 1

Anything else we need to know?

Seems to be related to #2671.

Member

neolit123 commented Sep 23, 2022

> When registry.k8s.io is specified

older k8s versions have a default of k8s.gcr.io.
if you pass a different registry url (like registry.k8s.io), the coredns path is different (by design):
https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init/#custom-images

during upgrade the registry.k8s.io migration will be handled for you.
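
The path difference described above can be illustrated with a small shell sketch. It is not kubeadm's code: `coredns_image` and `DEFAULT_REPO` are hypothetical names, and the rule (append `/coredns` only when the repository equals the default built into the binary) is an assumption inferred from the image names shown in this thread.

```shell
#!/bin/sh
# Assumption: DEFAULT_REPO stands in for the repository hardcoded in the
# kubeadm binary (k8s.gcr.io for the v1.24 binaries discussed here).
DEFAULT_REPO="k8s.gcr.io"

# coredns_image REPO TAG
# Hypothetical helper that prints the CoreDNS image reference for a repository.
coredns_image() {
    repo="$1"   # value of imageRepository
    tag="$2"    # CoreDNS tag, e.g. v1.8.6
    if [ "$repo" = "$DEFAULT_REPO" ]; then
        # Default registry: CoreDNS lives under an extra /coredns path.
        repo="$repo/coredns"
    fi
    # Any other registry is treated as custom: no /coredns path is added.
    printf '%s/coredns:%s\n' "$repo" "$tag"
}

coredns_image k8s.gcr.io v1.8.6        # k8s.gcr.io/coredns/coredns:v1.8.6
coredns_image registry.k8s.io v1.8.6   # registry.k8s.io/coredns:v1.8.6
```

The second result is the reference that fails with "not found" in the report above.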

neolit123 added the kind/support label Sep 23, 2022

Author

deric commented Sep 26, 2022

@neolit123 That would be great, indeed. Except the registry migration is not handled at all.

kubeadm upgrade apply v1.24.6 --v=5
I0926 07:33:39.550177 1903880 apply.go:104] [upgrade/apply] verifying health of cluster
I0926 07:33:39.550222 1903880 apply.go:105] [upgrade/apply] retrieving configuration from cluster
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
I0926 07:33:39.584600 1903880 kubelet.go:92] attempting to download the KubeletConfiguration from the new format location (UnversionedKubeletConfigMap=true)
I0926 07:33:39.604649 1903880 common.go:165] running preflight checks
[preflight] Running pre-flight checks.
I0926 07:33:39.604688 1903880 preflight.go:77] validating if there are any unsupported CoreDNS plugins in the Corefile
I0926 07:33:39.613121 1903880 preflight.go:105] validating if migration can be done for the current CoreDNS release.
[upgrade] Running cluster health checks
I0926 07:33:39.626638 1903880 health.go:162] Creating Job "upgrade-health-check" in the namespace "kube-system"
I0926 07:33:39.647401 1903880 health.go:192] Job "upgrade-health-check" in the namespace "kube-system" is not yet complete, retrying
I0926 07:33:40.651965 1903880 health.go:192] Job "upgrade-health-check" in the namespace "kube-system" is not yet complete, retrying
I0926 07:33:41.652311 1903880 health.go:192] Job "upgrade-health-check" in the namespace "kube-system" is not yet complete, retrying
I0926 07:33:42.653937 1903880 health.go:192] Job "upgrade-health-check" in the namespace "kube-system" is not yet complete, retrying
I0926 07:33:43.652061 1903880 health.go:192] Job "upgrade-health-check" in the namespace "kube-system" is not yet complete, retrying
I0926 07:33:44.653015 1903880 health.go:199] Job "upgrade-health-check" in the namespace "kube-system" completed
I0926 07:33:44.653056 1903880 health.go:205] Deleting Job "upgrade-health-check" in the namespace "kube-system"
I0926 07:33:44.672390 1903880 apply.go:112] [upgrade/apply] validating requested and actual version
I0926 07:33:44.672427 1903880 apply.go:128] [upgrade/version] enforcing version skew policies
[upgrade/version] You have chosen to change the cluster version to "v1.24.6"
[upgrade/versions] Cluster version: v1.24.3
[upgrade/versions] kubeadm version: v1.24.6
[upgrade/confirm] Are you sure you want to proceed with the upgrade? [y/N]: y
[upgrade/prepull] Pulling images required for setting up a Kubernetes cluster
[upgrade/prepull] This might take a minute or two, depending on the speed of your internet connection
[upgrade/prepull] You can also perform this action in beforehand using 'kubeadm config images pull'
I0926 07:33:46.508442 1903880 checks.go:834] using image pull policy: IfNotPresent
I0926 07:33:46.532087 1903880 checks.go:843] image exists: registry.k8s.io/kube-apiserver:v1.24.6
I0926 07:33:46.554428 1903880 checks.go:843] image exists: registry.k8s.io/kube-controller-manager:v1.24.6
I0926 07:33:46.575329 1903880 checks.go:843] image exists: registry.k8s.io/kube-scheduler:v1.24.6
I0926 07:33:46.597428 1903880 checks.go:843] image exists: registry.k8s.io/kube-proxy:v1.24.6
I0926 07:33:46.623373 1903880 checks.go:843] image exists: registry.k8s.io/pause:3.7
I0926 07:33:46.643634 1903880 checks.go:851] pulling: registry.k8s.io/coredns:v1.8.6
[preflight] Some fatal errors occurred:
        [ERROR ImagePull]: failed to pull image registry.k8s.io/coredns:v1.8.6: output: E0926 07:33:48.359458 1903951 remote_image.go:242] "PullImage from image service failed" err="rpc error: code = NotFound desc = failed to pull and unpack image \"registry.k8s.io/coredns:v1.8.6\": failed to resolve reference \"registry.k8s.io/coredns:v1.8.6\": registry.k8s.io/coredns:v1.8.6: not found" image="registry.k8s.io/coredns:v1.8.6"
time="2022-09-26T07:33:48Z" level=fatal msg="pulling image: rpc error: code = NotFound desc = failed to pull and unpack image \"registry.k8s.io/coredns:v1.8.6\": failed to resolve reference \"registry.k8s.io/coredns:v1.8.6\": registry.k8s.io/coredns:v1.8.6: not found"
, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`

Or is the recommended way to ignore the preflight errors?

Author

deric commented Sep 26, 2022

@neolit123 I guess the real issue is that the new registry registry.k8s.io is missing the image coredns:v1.8.6, while the old registry k8s.gcr.io works just fine.

Member

neolit123 commented Sep 26, 2022

what are the contents of:
kubectl get cm kubeadm-config -n kube-system

did you manually edit this cm?

> I guess the real issue is that the new registry registry.k8s.io is missing the image coredns:v1.8.6, while the old registry k8s.gcr.io works just fine.

we have upgrade ci that is passing.

all coredns images should be under a /coredns path.
also registry.k8s.io is just a redirect to the old registry, currently.

the key here is that a custom registry, different from the one hardcoded in the kubeadm binary, results in a path without /coredns. i mentioned this above.

@neolit123
Member

if you have manually edited the registry in the cm to registry.k8s.io, please revert it to k8s.gcr.io and try the upgrade again.

the 1.25 upgrade will do the migration for you.
changes are linked in #2671

Author

deric commented Sep 26, 2022

@neolit123

$ kubectl get cm kubeadm-config -n kube-system
NAME             DATA   AGE
kubeadm-config   1      41d

I'm not upgrading to v1.25 yet; as the description says, this is an upgrade from v1.24.3 -> v1.24.6.

The config.yaml contains the old registry:

$ cat /etc/kubernetes/config.yaml | grep imageRepo
imageRepository:  k8s.gcr.io

However, the image repository in the ConfigMap seems to have been changed already. I'm not sure how that happened (I haven't done it manually).

$ kubectl -n kube-system get cm kubeadm-config -o yaml | grep imageRepository
    imageRepository: registry.k8s.io

Member

pacoxu commented Sep 26, 2022

  • /etc/kubernetes/config.yaml is not a standard kubeadm configuration file. (You may have initialized the cluster with it. After that, you should check the ConfigMap to see the latest configuration.)
  • The kubeadm-config ConfigMap is not edited by kubeadm v1.24 or earlier; only kubeadm v1.25.x edits the ConfigMap during an upgrade.
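
To compare the two sources mentioned above, a small filter along these lines may help. This is a sketch only: `image_repository` is a hypothetical helper name, and it assumes `imageRepository:` sits at the start of its own line in the YAML, as in the output shown earlier in this thread.

```shell
#!/bin/sh
# image_repository: print the first imageRepository value found in YAML on stdin.
# Hypothetical helper, not a kubeadm or kubectl command.
image_repository() {
    sed -n 's/^[[:space:]]*imageRepository:[[:space:]]*//p' | head -n 1
}

# Inline YAML standing in for the ConfigMap contents:
printf '    imageRepository: registry.k8s.io\n' | image_repository

# On a real cluster (not run here), the two sources could be compared with:
#   kubectl -n kube-system get cm kubeadm-config -o yaml | image_repository
#   image_repository < /etc/kubernetes/config.yaml
```

If the two values differ, the ConfigMap is the one the upgrade reads, as the log line "Reading configuration from the cluster" indicates.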

@neolit123
Member

please use kubectl edit cm... to change the cm field to k8s.gcr.io

Author

deric commented Sep 26, 2022

I've reverted the imageRepository back to k8s.gcr.io

kubectl -n kube-system edit cm kubeadm-config

and now the upgrade seems to be working.
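
For anyone who prefers not to edit the ConfigMap interactively, the same change could be sketched as a stream rewrite. `revert_image_repository` is a hypothetical helper, and the pipeline in the trailing comment is an untested assumption about how it might be applied to a live cluster; review the rewritten YAML before applying anything.

```shell
#!/bin/sh
# revert_image_repository: rewrite "imageRepository: registry.k8s.io" back to
# "imageRepository: k8s.gcr.io" in YAML read from stdin.
# Lines naming any other repository pass through unchanged.
revert_image_repository() {
    sed 's|^\([[:space:]]*imageRepository:[[:space:]]*\)registry\.k8s\.io[[:space:]]*$|\1k8s.gcr.io|'
}

# Inline YAML standing in for the ConfigMap contents:
printf '    imageRepository: registry.k8s.io\n' | revert_image_repository

# Possible use against a cluster (assumption, not run here):
#   kubectl -n kube-system get cm kubeadm-config -o yaml \
#     | revert_image_repository \
#     | kubectl replace -f -
```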

weizhouapache added a commit to weizhouapache/cluster-api-provider-cloudstack that referenced this issue May 8, 2023
with registry.k8s.io, the control plane vm cannot be booted to Ready state due to error below

ubuntu@disk-offering-gzojvb-control-plane-gwbnt:~$ tail -f /var/log/cloud-init-output.log
[2023-05-08 11:09:23] [preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[2023-05-08 11:09:28] error execution phase preflight: [preflight] Some fatal errors occurred:
[2023-05-08 11:09:28] 	[ERROR ImagePull]: failed to pull image registry.k8s.io/coredns:v1.8.4: output: time="2023-05-08T11:09:28Z" level=fatal msg="pulling image: rpc error: code = NotFound desc = failed to pull and unpack image \"registry.k8s.io/coredns:v1.8.4\": failed to resolve reference \"registry.k8s.io/coredns:v1.8.4\": registry.k8s.io/coredns:v1.8.4: not found"
[2023-05-08 11:09:28] , error: exit status 1

this is same as kubernetes/kubeadm#2761
The new registry should be supported in k8s 1.25+. However, we still use 1.22/1.23/1.24 templates, so we need to use k8s.gcr.io

This change can be reverted when we upgrade to k8s 1.25+
weizhouapache added a commit to weizhouapache/cluster-api-provider-cloudstack that referenced this issue May 10, 2023
The image repository has been changed to registry.k8s.io by commit 8c1e614

However, with registry.k8s.io, the control plane vm cannot be booted to Ready state due to error below
```
    ubuntu@disk-offering-gzojvb-control-plane-gwbnt:~$ tail -f /var/log/cloud-init-output.log
    [2023-05-08 11:09:23] [preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
    [2023-05-08 11:09:28] error execution phase preflight: [preflight] Some fatal errors occurred:
    [2023-05-08 11:09:28]   [ERROR ImagePull]: failed to pull image registry.k8s.io/coredns:v1.8.4: output: time="2023-05-08T11:09:28Z" level=fatal msg="pulling image: rpc error: code = NotFound desc = failed to pull and unpack image \"registry.k8s.io/coredns:v1.8.4\": failed to resolve reference \"registry.k8s.io/coredns:v1.8.4\": registry.k8s.io/coredns:v1.8.4: not found"
    [2023-05-08 11:09:28] , error: exit status 1
```

this is same as kubernetes/kubeadm#2761
The new registry should be supported in k8s 1.25+. However, we still use 1.22/1.23/1.24 templates, so we need to use k8s.gcr.io

setting the image repository to "" so that capi/kubeadm will determine the default repository by kubernetes version.
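
The idea of choosing a default repository by Kubernetes version can be sketched as follows. `default_image_repository` is a hypothetical helper, and the 1.25 cutoff is taken only from the commit message above, not from kubeadm or Cluster API source; actual defaults may differ between patch releases.

```shell
#!/bin/sh
# default_image_repository VERSION
# Hypothetical helper: prints the assumed default repository for a version
# string such as v1.24.6, using the "1.25+" rule stated in the commit message.
default_image_repository() {
    version="${1#v}"          # strip a leading "v": v1.24.6 -> 1.24.6
    major="${version%%.*}"    # 1
    rest="${version#*.}"      # 24.6
    minor="${rest%%.*}"       # 24
    if [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 25 ]; }; then
        echo "registry.k8s.io"
    else
        echo "k8s.gcr.io"
    fi
}

default_image_repository v1.24.6   # k8s.gcr.io
default_image_repository v1.25.2   # registry.k8s.io
```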
hrak pushed a commit to hrak/cluster-api-provider-cloudstack that referenced this issue May 15, 2023
vignesh-goutham pushed a commit to vignesh-goutham/cluster-api-provider-cloudstack that referenced this issue Jul 12, 2023