Docker containers failing in /dev/.lxc/proc directory #2825
Comments
Hi.
|
Sorry, to clarify: I switched to a container where the docker installation wasn't broken |
@niko-daniel I'm not too familiar with the code yet, unfortunately. Do you think this issue is more appropriate for the https://github.com/lxc/lxc repository? |
@CarltonSemple no, this is done using LXD so this repository is the right place for this report. |
Sounds like Docker is confused by the mount table in the LXD container. We'll have to take a look and may have to file an issue with Docker upstream for this. |
No. But it's trivial to reproduce:
This turns out to be with the docker upstream 1.13.0-rc1 package installed in the container, because that's what I happened to have lying around. Can try with newer or Ubuntu packaging but somehow doubt that's going to make a difference. |
> Can try with newer or Ubuntu packaging but somehow doubt that's going to make a difference.
It doesn't.
|
Just so you know, I had another issue that is probably related. (in LXD container), Docker pull of some images doesn't complete: moby/moby#30569 |
Hi @stgraber, do you have any temporary workarounds, or a rough estimate of when this might be worked on? |
Or do you have any pointers on where to look so I could help? |
Hmm, so I'm not sure there's an easy workaround for this other than having Docker fixed to not use paths it can't access... /dev/.lxc/proc and /dev/.lxc/sys are paths mounted by LXC to allow container nesting in environments where the kernel would otherwise deny mounting new proc and sys instances because its overmounting protection kicks in. Unmounting them would prevent spawning nested containers, and allowing reads and writes under those paths would be a potential security issue. The correct fix here is to fix Docker with one of:
|
@mwhudson sounds like we may want an extra test for --privileged in our autopkgtest and a patch to fix this issue ^ |
@stgraber Agreed. I'm doing some investigating right now, building docker myself. The error comes when the daemon tries to read process information about itself, in /dev/.lxc/proc/<daemon_pid>. |
@CarltonSemple what's odd is that it only does so when --privileged is passed? I'd think that docker would access /proc/self at some other point too... Anyway, I still find it weird that they go through all the trouble of scanning /proc/mounts (or mountinfo) for all instances of /proc rather than just pick the standard path for it... |
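To make the mount-table point above concrete, here is a minimal sketch of how scanning a mountinfo-style table turns up more than one proc instance inside an LXD container. It is not Docker's code. The sample table is illustrative rather than captured from a real container, and the names `parse_mountinfo`, `usable_proc_mounts` and the `hidden_prefix` parameter are hypothetical.

```python
from typing import List, NamedTuple

# Illustrative mountinfo lines (an assumption, not taken from a real container).
SAMPLE_MOUNTINFO = """\
22 27 0:21 / /proc rw,nosuid,nodev,noexec,relatime - proc proc rw
40 25 0:21 / /dev/.lxc/proc rw,relatime - proc proc rw
41 25 0:22 / /dev/.lxc/sys rw,relatime - sysfs sys rw
25 27 0:23 / /dev rw,relatime - tmpfs none rw,size=492k
"""


class Mount(NamedTuple):
    mount_point: str
    fs_type: str


def parse_mountinfo(text: str) -> List[Mount]:
    """Parse /proc/<pid>/mountinfo-formatted text into Mount records."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        try:
            # Optional fields end at a lone "-"; it can never come before index 6.
            sep = fields.index("-", 6)
        except ValueError:
            continue  # not a valid mountinfo line
        if len(fields) <= sep + 1:
            continue
        # Field 5 (index 4) is the mount point; the fs type follows the separator.
        mounts.append(Mount(mount_point=fields[4], fs_type=fields[sep + 1]))
    return mounts


def usable_proc_mounts(text: str, hidden_prefix: str = "/dev/.lxc/") -> List[str]:
    """Return proc mount points, ignoring the LXC helper mounts."""
    return [
        m.mount_point
        for m in parse_mountinfo(text)
        if m.fs_type == "proc" and not m.mount_point.startswith(hidden_prefix)
    ]


if __name__ == "__main__":
    print(usable_proc_mounts(SAMPLE_MOUNTINFO))
```

A scanner that accepts every proc entry would also pick up /dev/.lxc/proc, which is the path the errors in this thread point at. Filtering by prefix is only one possible approach; always using the standard /proc path, as suggested above, avoids the scan entirely.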
@stgraber Yes. The function where the behavior changes is I was able to add ".lxc" to the skipped directories in the
INFO[0001] Loading containers: done.
INFO[0001] Daemon has completed initialization
INFO[0001] Docker daemon commit=1564f02-unsupported graphdriver=overlay version=1.12.4
INFO[0001] API listen on /var/run/docker.sock
Privileged...
skipping .lxc
skipping .lxd-mounts
appending
appending
/dev / hugepages
skipping lxd
skipping mqueue
/dev / net
appending
appending
skipping pts
appending
skipping shm
appending
appending
appending
ERRO[0010] containerd: start container error=oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:291: setting cgroup config for ready process caused \"failed to write a *:* rwm to devices.allow: write /sys/fs/cgroup/devices/docker/997a8faece21427b2efe84c901978b178c3cbcb67da6787c2782ba23750c8a21/devices.allow: operation not permitted\""
id=997a8faece21427b2efe84c901978b178c3cbcb67da6787c2782ba23750c8a21
ERRO[0010] Create container failed with error: invalid header field value "oci runtime error: container_linux.go:247: starting container process caused \"process_linux.go:291: setting cgroup config for ready process caused \\\"failed to write a *:* rwm to devices.allow: write /sys/fs/cgroup/devices/docker/997a8faece21427b2efe84c901978b178c3cbcb67da6787c2782ba23750c8a21/devices.allow: operation not permitted\\\"\"\n"
ERRO[0010] Handler for POST /v1.24/containers/997a8faece21427b2efe84c901978b178c3cbcb67da6787c2782ba23750c8a21/start returned error: invalid header field value "oci runtime error: container_linux.go:247: starting container process caused \"process_linux.go:291: setting cgroup config for ready process caused \\\"failed to write a *:* rwm to devices.allow: write /sys/fs/cgroup/devices/docker/997a8faece21427b2efe84c901978b178c3cbcb67da6787c2782ba23750c8a21/devices.allow: operation not permitted\\\"\"\n" |
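The "skipping" and "appending" lines in the log above come from a walk over /dev that ignores certain directory names. A rough sketch of that idea follows, assuming a plain recursive walk. It is not the runc or Docker implementation: `walk_devices` and `SKIPPED_DIRS` are hypothetical names, and the skip list simply mirrors the directory names reported as skipped in the log.

```python
import os
from typing import Iterable, List

# Directory names reported as "skipping ..." in the log above.
SKIPPED_DIRS = frozenset({".lxc", ".lxd-mounts", "lxd", "mqueue", "pts", "shm"})


def walk_devices(root: str, skipped: Iterable[str] = SKIPPED_DIRS) -> List[str]:
    """Return paths (relative to root) of all non-directory entries under root,
    never descending into a directory whose name is in `skipped`."""
    skipped = set(skipped)
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not enter the skipped directories.
        dirnames[:] = sorted(d for d in dirnames if d not in skipped)
        for name in sorted(filenames):
            found.append(os.path.relpath(os.path.join(dirpath, name), root))
    return found


if __name__ == "__main__":
    for path in walk_devices("/dev"):
        print(path)
```

Pruning by name keeps the walk away from /dev/.lxc/proc, so per-process entries that vanish mid-walk are never touched.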
Ok, that one is expected too, you can't write to devices.allow or devices.deny from inside a user namespace. I'm guessing Docker already has logic to deal with this somewhere else, otherwise normal containers would fail too. |
It looks related to this older issue: https://bugs.launchpad.net/ubuntu/+source/lxd/+bug/1593301 |
Though that was only showing up when the LXD container itself was privileged. That shouldn't happen when the LXD container is unprivileged, unless Docker's --privileged bypasses whatever code was avoiding that cgroup config. |
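Since writes to devices.allow fail inside a user namespace, a tool can check for that case before attempting them. Below is a minimal sketch of one common heuristic: treat anything other than the full identity mapping in /proc/self/uid_map as a user namespace. This is an assumption about how such a check could look, not a copy of Docker's logic, and the function names are hypothetical.

```python
def in_user_namespace(uid_map_text: str) -> bool:
    """Heuristic: the initial namespace maps the full UID range onto itself,
    i.e. a single entry "0 0 4294967295". Anything else is treated as a
    user namespace (including an empty, not-yet-written map)."""
    entries = [
        tuple(int(value) for value in line.split())
        for line in uid_map_text.splitlines()
        if line.strip()
    ]
    return entries != [(0, 0, 4294967295)]


def running_in_user_namespace(path: str = "/proc/self/uid_map") -> bool:
    """Read the kernel's uid_map for this process; a missing file means the
    kernel has no user namespace support, so report False."""
    try:
        with open(path) as handle:
            return in_user_namespace(handle.read())
    except FileNotFoundError:
        return False


if __name__ == "__main__":
    print(running_in_user_namespace())
```

In an unprivileged LXD container the map typically looks like `0 100000 65536`, so the check returns True and the device cgroup writes can be skipped instead of failing with "operation not permitted".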
Gotcha. So I switched back to an unprivileged LXD container, and that error no longer shows up.
root@docker2:~# ./dockerd-1.13.0
INFO[0000] libcontainerd: new containerd process, pid: 475
INFO[0001] [graphdriver] using prior storage driver: overlay
INFO[0001] Graph migration to content-addressability took 0.00 seconds
WARN[0001] Your kernel does not support swap memory limit.
WARN[0001] Your kernel does not support cgroup rt period
WARN[0001] Your kernel does not support cgroup rt runtime
INFO[0001] Loading containers: start.
WARN[0001] Running modprobe bridge br_netfilter failed with message: modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.4.0-59-generic/modules.dep.bin'
modprobe: WARNING: Module bridge not found in directory /lib/modules/4.4.0-59-generic
modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.4.0-59-generic/modules.dep.bin'
modprobe: WARNING: Module br_netfilter not found in directory /lib/modules/4.4.0-59-generic
, error: exit status 1
WARN[0001] Running modprobe nf_nat failed with message: `modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.4.0-59-generic/modules.dep.bin'
modprobe: WARNING: Module nf_nat not found in directory /lib/modules/4.4.0-59-generic`, error: exit status 1
WARN[0001] Running modprobe xt_conntrack failed with message: `modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.4.0-59-generic/modules.dep.bin'
modprobe: WARNING: Module xt_conntrack not found in directory /lib/modules/4.4.0-59-generic`, error: exit status 1
INFO[0001] Firewalld running: false
INFO[0001] Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address
INFO[0002] Loading containers: done.
INFO[0002] Daemon has completed initialization
INFO[0002] Docker daemon commit=49bf474-unsupported graphdriver=overlay version=1.13.0
INFO[0002] API listen on /var/run/docker.sock
WARN[0103] Your kernel does not support swap memory limit.
WARN[0103] Your kernel does not support cgroup rt period
WARN[0103] Your kernel does not support cgroup rt runtime
ERRO[0103] containerd: start container error=oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:252: getting pipe fds for pid 631 caused \"readlink /proc/631/fd/0: permission denied\""
id=b638a5975734e527a8c15fc24b5c6222c8ceced1414c7657a6a4362ebb5ed95c
ERRO[0103] Create container failed with error: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:252: getting pipe fds for pid 631 caused \"readlink /proc/631/fd/0: permission denied\""
ERRO[0103] containerd: deleting container error=exit status 1: "container b638a5975734e527a8c15fc24b5c6222c8ceced1414c7657a6a4362ebb5ed95c does not exist\none or more of the container deletions failed\n"
ERRO[0104] Handler for POST /v1.25/containers/b638a5975734e527a8c15fc24b5c6222c8ceced1414c7657a6a4362ebb5ed95c/start returned error: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:252: getting pipe fds for pid 631 caused \"readlink /proc/631/fd/0: permission denied\""
I'm assuming the module errors are okay, after reading https://github.com/lxc/lxd/issues/2321
What's weird is that the root user inside of the container can go into the /proc/$(pid)/fd directory, but doesn't seem to have access to all of the files.
|
That's indeed a bit weird. Can you try to ssh into the container or run docker through a "script /dev/null" session perhaps? I want to check if maybe the problem is to do with the pts device that you get from LXD when doing "lxc exec" (which coming from the host would be owned by the host and so may cause those permission issues). |
The extra test is trivial, as is adding a patch or two, once we have them... |
@stgraber you mean like this? I also tried building off of
root@docker3:~# script /dev/null
Script started, file is /dev/null
# ./dockerd-1.13.0
INFO[0000] libcontainerd: new containerd process, pid: 2854
INFO[0001] [graphdriver] using prior storage driver: overlay
INFO[0001] Graph migration to content-addressability took 0.00 seconds
WARN[0001] Your kernel does not support swap memory limit.
WARN[0001] Your kernel does not support cgroup rt period
WARN[0001] Your kernel does not support cgroup rt runtime
INFO[0001] Loading containers: start.
WARN[0001] Running modprobe bridge br_netfilter failed with message: modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.4.0-59-generic/modules.dep.bin'
modprobe: WARNING: Module bridge not found in directory /lib/modules/4.4.0-59-generic
modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.4.0-59-generic/modules.dep.bin'
modprobe: WARNING: Module br_netfilter not found in directory /lib/modules/4.4.0-59-generic
, error: exit status 1
WARN[0001] Running modprobe nf_nat failed with message: `modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.4.0-59-generic/modules.dep.bin'
modprobe: WARNING: Module nf_nat not found in directory /lib/modules/4.4.0-59-generic`, error: exit status 1
WARN[0001] Running modprobe xt_conntrack failed with message: `modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.4.0-59-generic/modules.dep.bin'
modprobe: WARNING: Module xt_conntrack not found in directory /lib/modules/4.4.0-59-generic`, error: exit status 1
INFO[0001] Firewalld running: false
INFO[0001] Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address
INFO[0001] Loading containers: done.
INFO[0001] Daemon has completed initialization
INFO[0001] Docker daemon commit=49bf474-unsupported graphdriver=overlay version=1.13.0
INFO[0001] API listen on /var/run/docker.sock
WARN[0042] Your kernel does not support swap memory limit.
WARN[0042] Your kernel does not support cgroup rt period
WARN[0042] Your kernel does not support cgroup rt runtime
ERRO[0042] containerd: start container error=oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:252: getting pipe fds for pid 3002 caused \"readlink /proc/3002/fd/0: permission denied\""
id=96eee0b4ec4e56336f030bc34b881121f1c7c109c13bf2265fa166838a89cb06
ERRO[0042] Create container failed with error: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:252: getting pipe fds for pid 3002 caused \"readlink /proc/3002/fd/0: permission denied\""
ERRO[0042] containerd: deleting container error=exit status 1: "container 96eee0b4ec4e56336f030bc34b881121f1c7c109c13bf2265fa166838a89cb06 does not exist\none or more of the container deletions failed\n"
ERRO[0042] Handler for POST /v1.25/containers/96eee0b4ec4e56336f030bc34b881121f1c7c109c13bf2265fa166838a89cb06/start returned error: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:252: getting pipe fds for pid 3002 caused \"readlink /proc/3002/fd/0: permission denied\"" |
The error looks like it's coming from https://github.com/opencontainers/runc/blob/master/libcontainer/process_linux.go#L247 , and might be related to opencontainers/runc#1130 |
@stgraber is the code for the LXD version of Docker on GitHub anywhere? I want to make sure I'm not running into unrelated errors. |
@CarltonSemple I don't believe @mwhudson maintains it in git but you can go to https://launchpad.net/ubuntu/+source/docker.io and grab the debian.tar.xz tarball for the latest release there, you'll find a debian/patches directory which includes all the patches that Ubuntu's docker has on top of the clean upstream tarball. |
The packaging is maintained at https://github.com/tianon/debian-docker/tree/ubuntu (sometimes https://github.com/mwhudson/debian-docker/tree/ubuntu is more up to date for a day or two), but that's not really the most comprehensible form if you're not used to Debian packaging already.
|
Can we confirm that opencontainers/runc#1327 solves the original problem? In any of the docker.io packages at https://launchpad.net/ubuntu/+source/docker.io, the runC code of interest is in the vendor directory of docker. @mwhudson @stgraber |
With regards to this issue:
That's to be expected. Trying to |
@cyphar gotcha. I was able to get past that error though, using an update to runc. I now have Docker 1.13 unprivileged containers running. Have you been able to verify opencontainers/runc#1327 ? I am not familiar with debian packaging, so the best I could do was download the docker.orig.tar.gz and try building that. opencontainers/runc#1327 gets me past the I was hoping we could at least cover that error in this issue and open another issue for privileged Docker containers. |
opencontainers/runc#1327 looks like an okay change to me, and I'll LGTM it once I figure out why the CI broke (it's probably not because of your change). It should fix your issue with Docker, but I would recommend verifying that yourself (try replacing |
@cyphar CI failure is related to opencontainers/runc#1237 |
@cyphar yes, I included the opencontainers/runc#1327 change when building As for Using that, I was able to get past Now I have the error |
@stgraber does that last error look like more of an LXD issue? |
@CarltonSemple not sure. It says it's creating device nodes, but the error suggests it attempted to open it for read/write. /dev/tty in LXD containers should work fine, so I'm not sure why the one inside the Docker container doesn't. |
@CarltonSemple I would very strongly recommend against using the line you linked -- it makes runC vulnerable to race conditions that allow processes to access the host (see CVE-2016-9962). I'm working on making this code safe and work properly -- please look at opencontainers/runc#774 and see if you can cherry-pick the changes to As for the |
@brauner Since you wrote the patch I would recommend reading the above. The stuff I mentioned at FOSDEM about |
This issue was noted in an LXC/LXD presentation at KubeCon 2017 in Berlin. |
opencontainers/runc#774 has been merged and from what I last heard, @brauner is going to backport or otherwise apply opencontainers/runc@6bd4bd9 (which should fix the unprivileged problems noted above). |
Yes, I'm not the one directly responsible for this but @mwhudson does have a plan. :) |
Can I get a test case for the unprivileged problems please? And am I correct to think that there is no pending fix for "lxc exec docker -- docker run --privileged"? (Sorry, I find this report pretty confusing) |
@mwhudson I believe opencontainers/runc#774 is supposed to fix some of the issues, right @cyphar ? |
Can someone please help me with the steps to fix the following error? docker: Error response from daemon: linux runtime spec devices: lstat /dev/.lxc/proc/1482/fdinfo/12: no such file or directory. I read that I have to use a specific runC binary |
@infinitydon For now, stick to the docker.io package from Ubuntu, not the one you download from Docker upstream; otherwise things won't work. |
I'm closing this issue because there's nothing for LXD to do here. We also plan on adding daily testing of the dev branch of Docker to detect regressions before they hit users. |
@stgraber , the same error appears with
What is the current workaround? |
Also running into this issue as I'm attempting to run Kubernetes inside of LXD containers. It appears there may be a fix in 17.06 docker, but that hasn't hit the edge repo yet: |
Kubernetes hasn't officially released support for Docker > 1.13.x that I'm aware of. That may change with the 1.7 release, so I don't think that upgrading to the CE Docker flavors of 17.x will be feasible at this time without introducing other unforeseen problems. |
Ahh, one step forward, two steps back. |
Hello, I am probably in over my head here. Trying to run docker via LX[CD]. Earlier it was mentioned to use docker.io versus other. That did not work here. I am on a Mac running this in a VM (Fusion). Any tips would be great. I am trying to run Swarm across several LXD instances in a VM.
With
|
I'm sorry to hijack this issue, but do we have some patch to fix this error?
|
I got this error, exit code 1. |
Issue description
Inside an LXD container, whether unprivileged or privileged, privileged Docker containers fail when they try to access process information. An example is Weave Scope.
Steps to reproduce
lxc launch ubuntu-daily:16.04 docker -p default -p docker
lxc exec docker -- apt install docker.io -y
lxc exec docker bash
sudo curl -L git.io/scope -o /usr/local/bin/scope
sudo chmod a+x /usr/local/bin/scope
scope launch
Error message from Weave Scope:
Error message when running Kubernetes Kube-deploy (https://github.com/kubernetes/kube-deploy/tree/master/docker-multinode):
Required information