Calico 3.8 fails to install under CoreOS stable #2712

Closed
bmcustodio opened this issue Jul 9, 2019 · 6 comments
@bmcustodio

bmcustodio commented Jul 9, 2019

Expected Behavior

Running kubectl apply -f https://docs.projectcalico.org/v3.8/manifests/calico.yaml eventually results in a calico-node-X pod in the Running status.

Current Behavior

Running kubectl apply -f https://docs.projectcalico.org/v3.8/manifests/calico.yaml results in calico-node-X stuck forever in the Init:0/3 status.

Possible Solution

I got the following output from kubectl describe:

$ kubectl -n kube-system describe pod calico-node-r76lz
(...)
Events:
  Type     Reason       Age                  From               Message
  ----     ------       ----                 ----               -------
  Normal   Scheduled    2m32s                default-scheduler  Successfully assigned kube-system/calico-node-r76lz to node-01
  Warning  FailedMount  29s                  kubelet, node-01   Unable to mount volumes for pod "calico-node-r76lz_kube-system(fba1f97a-64ff-4f98-b4c3-dadcd011469e)": timeout expired waiting for volumes to attach or mount for pod "kube-system"/"calico-node-r76lz". list of unmounted volumes=[flexvol-driver-host]. list of unattached volumes=[lib-modules var-run-calico var-lib-calico xtables-lock cni-bin-dir cni-net-dir host-local-net-dir policysync flexvol-driver-host calico-node-token-h5czw]
  Warning  FailedMount  25s (x9 over 2m32s)  kubelet, node-01   MountVolume.SetUp failed for volume "flexvol-driver-host" : mkdir /usr/libexec/kubernetes: read-only file system

This makes me think that the installation manifest should maybe use a different path to install the FlexVolume driver?

(...)
        - name: flexvol-driver-host
          hostPath:
            type: DirectoryOrCreate
            path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~uds
(...)

Steps to Reproduce (for bugs)

  1. Spin up a machine running CoreOS stable.
  2. Run kubeadm init --pod-network-cidr=192.168.0.0/16.
  3. Configure kubectl according to the instructions.
  4. Run kubectl apply -f https://docs.projectcalico.org/v3.8/manifests/calico.yaml.

Context

I was trying to deploy Calico 3.8 in a brand new (bare-metal) Kubernetes 1.15.0 cluster bootstrapped by Kubeadm.

I am not entirely sure, but this may be related to #2699 somehow, even though Calico 3.7 installs without any issues.

Your Environment

  • Calico version: 3.8
  • Orchestrator version (e.g. kubernetes, mesos, rkt): Kubernetes 1.15.0.
  • Operating System and version: Container Linux by CoreOS stable (2135.5.0)
  • Link to your project (optional): N/A
@mmack

mmack commented Jul 10, 2019

I fixed it by setting the correct flexvolume dir:

        # Used to install Flex Volume Driver
        - name: flexvol-driver-host
          hostPath:
            type: DirectoryOrCreate
            path: /var/lib/kubelet/volumeplugins/nodeagent~uds

The path needs to be a directory inside the one the kubelet is configured with:
--volume-plugin-dir=/var/lib/kubelet/volumeplugins

The default path is not writable on CoreOS.
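
For anyone who would rather not edit the manifest by hand, the same change can be sketched as a small filter. This is a minimal sketch, not an official procedure: the function name patch_flexvol_path and the PLUGIN_DIR variable are made up for illustration, and it assumes the kubelet is already running with --volume-plugin-dir set to the same directory.

```shell
#!/bin/sh
# Assumption: PLUGIN_DIR must match the kubelet's --volume-plugin-dir setting.
PLUGIN_DIR="${PLUGIN_DIR:-/var/lib/kubelet/volumeplugins}"
# Default FlexVolume location used by the v3.8 manifest (read-only on CoreOS).
DEFAULT_DIR="/usr/libexec/kubernetes/kubelet-plugins/volume/exec"

# Hypothetical helper: reads a manifest on stdin and writes it to stdout with
# the default FlexVolume directory replaced by PLUGIN_DIR.
patch_flexvol_path() {
  sed "s#${DEFAULT_DIR}/#${PLUGIN_DIR}/#g"
}

# Usage (needs network and cluster access, so it is left commented out):
#   curl -fsSL https://docs.projectcalico.org/v3.8/manifests/calico.yaml \
#     | patch_flexvol_path \
#     | kubectl apply -f -
```

Only lines containing the default directory are touched, so the rest of the manifest passes through unchanged.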

@rafaelvanoni
Contributor

@bmcstdio Does @mmack's workaround solve your issue?

@bmcustodio
Author

@rafaelvanoni yes, it does solve it. However, as @mmack mentions, it requires that --volume-plugin-dir (and, I believe, --flex-volume-plugin-dir in kube-controller-manager) be set to a writable path. I am not sure this is mentioned in the documentation, but I believe it should be. It also means that installing Calico 3.8 is no longer as easy as running kubectl apply -f (...), because the manifest has to be tweaked. 😔
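
For a kubeadm-bootstrapped cluster like the one in this report, one way to set both flags might be a kubeadm configuration file. The following is a rough sketch under several assumptions: PLUGIN_DIR and CONFIG_FILE are illustrative names, kubeadm.k8s.io/v1beta2 is the API version that shipped with Kubernetes 1.15 (adjust for other releases), and the controller-manager flag is included only because it is believed to be needed. The controller-manager pod may also need the directory mounted into it, which is not shown here.

```shell
#!/bin/sh
# Sketch only. PLUGIN_DIR and CONFIG_FILE are illustrative, not from the issue.
PLUGIN_DIR="${PLUGIN_DIR:-/var/lib/kubelet/volumeplugins}"
CONFIG_FILE="${CONFIG_FILE:-kubeadm-config.yaml}"

# Write a kubeadm config that points the kubelet and kube-controller-manager
# at a writable FlexVolume plugin directory.
cat > "$CONFIG_FILE" <<EOF
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
nodeRegistration:
  kubeletExtraArgs:
    volume-plugin-dir: "${PLUGIN_DIR}"
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
networking:
  podSubnet: "192.168.0.0/16"
controllerManager:
  extraArgs:
    flex-volume-plugin-dir: "${PLUGIN_DIR}"
EOF

# Then, on the node (takes the place of kubeadm init --pod-network-cidr=...):
#   kubeadm init --config "$CONFIG_FILE"
```

The Calico manifest still has to point its flexvol-driver-host hostPath at a subdirectory of the same location, as in @mmack's comment.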

@caseydavenport
Member

The default path is not writable on coreos.

What's the proposed solution here? I think we have a conflict where Calico wants to enable this feature by default, but the default that Kubernetes uses isn't compatible with CoreOS (and probably some other environments as well).

I think our options are:

  • Turn this off by default in Calico, making configuration for this harder for people who want to use ALP features. This isn't really desirable.
  • Document how to switch this to something other than the default (no longer a one-click install for some users, also not desirable)
  • ???

I don't know of a way we could auto-detect this and magic the problem away. Does anyone have any ideas?

@rafaelvanoni
Contributor

In the meantime, we've documented this issue and workaround in
https://docs.projectcalico.org/v3.10/reference/faq#are-the-calico-manifests-compatible-with-coreos

@caseydavenport
Member

I'm not sure there's anything more we can do besides document this, so going to close this for now.
