Docker containers failing in /dev/.lxc/proc directory #2825

Closed
CarltonSemple opened this issue Jan 26, 2017 · 55 comments
Labels
External: Issue is about a bug/feature in another project

Comments

@CarltonSemple

CarltonSemple commented Jan 26, 2017

Issue description

Inside an LXD container, whether unprivileged or privileged, privileged Docker containers fail when they try to access process information. Weave Scope is one example.

Steps to reproduce

  1. lxc launch ubuntu-daily:16.04 docker -p default -p docker
  2. lxc exec docker -- apt install docker.io -y
  3. lxc exec docker bash
  4. sudo curl -L git.io/scope -o /usr/local/bin/scope
    sudo chmod a+x /usr/local/bin/scope
    scope launch

Error message from Weave Scope:
screen shot 2017-01-26 at 2 16 42 pm

Error message when running Kubernetes Kube-deploy (https://github.com/kubernetes/kube-deploy/tree/master/docker-multinode):
screen shot 2017-01-26 at 2 47 27 pm

Required information

  • Distribution: Ubuntu
  • Distribution version: 16.04
  • The output of "lxc info":
lxc info
apiextensions:
- id_map
apistatus: stable
apiversion: "1.0"
auth: trusted
environment:
  addresses: []
  architectures:
  - x86_64
  - i686
  certificate: |
    -----BEGIN CERTIFICATE-----
    -----END CERTIFICATE-----
  certificatefingerprint: 5b1389c121a01ff313b99a2dbab00bfe375fcb9b137f39c1a5409dcd87bdafd5
  driver: lxc
  driverversion: 2.0.6
  kernel: Linux
  kernelarchitecture: x86_64
  kernelversion: 4.4.0-59-generic
  server: lxd
  serverpid: 1635
  serverversion: 2.0.8
  storage: dir
  storageversion: ""
config: {}
public: false
@niko-daniel

Hi.
Make sure the Docker daemon service is running properly inside your container; try repairing the Docker installation:

lxc exec docker -- apt install docker.io runc containerd

@CarltonSemple
Author

Hi, so I did that just now. The response is the same:
screen shot 2017-01-26 at 11 02 57 pm

Here's the debug output:

root@c102:~# lxc exec --debug docker -- scope launch
DBUG[01-27|04:00:29] Raw response: {"type":"sync","status":"Success","status_code":200,"metadata":{"api_extensions":[],"api_status":"stable","api_version":"1.0","auth":"trusted","config":{"core.https_address":"[::]:8443"},"environment":{"addresses":["10.176.216.87:8443","169.46.71.26:8443","169.46.32.67:8443","10.212.93.1:8443","[fddc:c4f5:3955:8059::1]:8443","10.1.59.0:8443","10.1.59.1:8443"],"architectures":["x86_64","i686"],"certificate":"-----BEGIN CERTIFICATE-----\nMIIGnDCCBISgAwIBAgIQNLbmOp3Oah++jp/XNhaJsjANBgkqhkiG9w0BAQsFADA6\nMRwwGgYDVQQKExNsaW51eGNvbnRhaW5lcnMub3JnMRowGAYDVQQDDBFyb290QGMx\nMDIuaWJtLmNvbTAeFw0xNzAxMjYyMDM4MDFaFw0yNzAxMjQyMDM4MDFaMDoxHDAa\nBgNVBAoTE2xpbnV4Y29udGFpbmVycy5vcmcxGjAYBgNVBAMMEXJvb3RAYzEwMi5p\nYm0uY29tMIICIjANBgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEAxQt44YZ7T6DC\nJu6I0XoRLUCSl+JHa36xdmmyGPnMF68wzsHj1L5F6sQgUIni1mK3eiGeZv95ZpcM\nJAcS7LcsRzyTycZ/2ELyrIvirFsVM36Nw3TnUxQQof+aKByUDYCBfvXCPt9Tzx74\njC8deoLcTwKGjFhaLGJKkQGAg26IDaKLC5K+BFHKvB5E0Y6m1fpshXtYtvBPgMCh\nYffAB/WxpWlY6JYQkdYxRNtVxQbuIj4uCP7gVfuH8dX4jxD4HwXk3xewQgRnf8N5\na0fEbFFhAj9cUUu/7lAvtFd3F4EaXNprfYgjCHDwBzcaEXSAF3K636YYU+v/x/sP\nXOm6TgHZP1x0oCXcScdNLpl1MmvRzMHAOXkQ7SJXbqs8IUv9c8hiTjHhE8nXVpOU\nQumjI9PasPkDls2EdJiB19PSn0QP9EOpcfTeYXdZRQO7dAjuNYmYZ25L9gZQ68op\ndONlPNHaBEnvNW6hMzgUiRpY2dQwVQ8VCtXxV7bEURFbPkk3BX/9nTx9mRcejoxD\nav9GF0wyCNvo5tEJZVslUXaCZrl4MQ2su5O4K7D7gSw73XxRI9p/fuYRS6e6DWJ1\neRTxM4apLfcdniv14vB7vQyv4HRTRZ6z+HF+La0ssD1GIIl1ImCW1t8qV6jNt8/0\nt67UutEiDk9L75mbyQT5pMtXg1xvMAcCAwEAAaOCAZwwggGYMA4GA1UdDwEB/wQE\nAwIFoDATBgNVHSUEDDAKBggrBgEFBQcDATAMBgNVHRMBAf8EAjAAMIIBYQYDVR0R\nBIIBWDCCAVSCDGMxMDIuaWJtLmNvbYIcZmU4MDo6OTBhNjpmZWZmOmZlMTQ6ZmYx\nMi82NIIQMTAuMTc2LjIxNi44Ny8yNoIbZmU4MDo6NDEzOjE3ZmY6ZmU0MTo2NGIx\nLzY0gg8xNjkuNDYuNzEuMjYvMjeCDzE2OS40Ni4zMi42Ny8zMoIaZmU4MDo6NGJm\nOjhjZmY6ZmVkZjo1ODAvNjSCG2ZlODA6OmMxNjpiNmZmOmZlZDM6MzZhNC82NIIc\nZmU4MDo6OTg0YjoxNmZmOmZlYmQ6MmVmYy82NIIKZmU4MDo6MS82NIIMMTAuMS41\nOS4wLzE2ggwxMC4xLjU5LjEvMjSCGmZlODA6OjQyOmM3ZmY6ZmVkZjoyODNmLzY0\nghxmZTg
wOjpkMDE3OjVmZmY6ZmVlYTpjYmJhLzY0ghxmZTgwOjpmMGEzOjg3ZmY6\nZmUxMTpiMmVlLzY0MA0GCSqGSIb3DQEBCwUAA4ICAQBVORcyjdC5cacO88EhNCeT\nChM/7N6i2ZSVWbs/Uyq/7KfC2uOgEsXMCB5L5CvEHW5k9+ABa+XYXAX2rdWsSc//\nB4b5N8GnQcFHxD9n7p1hjRdS/V2PDFsYJoRxD7tOPhfxz5IjG9N5c3r/XvtPgV7X\nwPtQ2+nbNraPK9lHOJ+SFDYxrXxX3JT35BXyhX6KnE2Xwnc0iL/bgtwcd8qWQAMS\nzzcbOPM3QJ27glOkN7bvQ8KSwz1pvjOobwI/yJkE7YhetsjMa7gm/a9kbFOGLyg8\nDy7SN3cOj56jmkt7EO1tQeN3LcA7zi5Km7tu23bYhldOOwEUwmpQ8Ikn0Z8aaJC8\nPpKkexGpKMYIP6joH+397xfdAwxRUWStjzWbLPMHNt9Jz67ZnZBDcwvy4Ysct350\nbS0m5oIm+Jnrn+M3bW2m6pirWgy+KksFNsGNa1vAqB0csq2pzmwrC9Tuut1JaKL8\ngkxx1o2C4aYjPEKnPf3zhMhDAakZv9E23Gu50DMATp+ZL+i0ZLnPPt2ZgynPM3WR\nx05thMWKnz7e6LFxcTi0tW1vrDyIrYRQYsvxt3zaCrbfnP4wLjPfBwdXQp8KZx6F\nXOvXBlcX3B8O5NekHL8uPwQp56PYnQeTl/yBzzkXHBe1YUl59QTt7Q3ncmCYSS+i\ngtL42kYuaE6OaDpk7I5pAw==\n-----END CERTIFICATE-----\n","driver":"lxc","driver_version":"2.0.5","kernel":"Linux","kernel_architecture":"x86_64","kernel_version":"4.4.0-45-generic","server":"lxd","server_pid":15078,"server_version":"2.0.5","storage":"dir","storage_version":""},"public":false}}
 
DBUG[01-27|04:00:29] Posting {"command":["scope","launch"],"environment":{"HOME":"/root","TERM":"xterm-256color","USER":"root"},"height":62,"interactive":true,"wait-for-websocket":true,"width":125} to http://unix.socket/1.0/containers/docker/exec
DBUG[01-27|04:00:29] Raw response: {"type":"async","status":"Operation created","status_code":100,"metadata":{"id":"357ea5c2-98b6-4ece-9bc8-a82b2fb2c196","class":"websocket","created_at":"2017-01-27T04:00:29.275585835Z","updated_at":"2017-01-27T04:00:29.275585835Z","status":"Running","status_code":103,"resources":{"containers":["/1.0/containers/docker"]},"metadata":{"fds":{"0":"1e7d2045a84d4e19d4e3372f84eeaa58bb8c29d29b40e4ed14ab4043a421db69","control":"09e3d36b9860f9b29e7290c52899f8983eaa30334b55ea5f43f0974d6947dd86"}},"may_cancel":false,"err":""},"operation":"/1.0/operations/357ea5c2-98b6-4ece-9bc8-a82b2fb2c196"}
docker: Error response from daemon: linux runtime spec devices: lstat /dev/.lxc/proc/1482/fdinfo/12: no such file or directory.
DBUG[01-27|04:00:30] got message barrier
DBUG[01-27|04:00:30] 1.0/operations/357ea5c2-98b6-4ece-9bc8-a82b2fb2c196/wait
DBUG[01-27|04:00:30] Raw response: {"type":"sync","status":"Success","status_code":200,"metadata":{"id":"357ea5c2-98b6-4ece-9bc8-a82b2fb2c196","class":"websocket","created_at":"2017-01-27T04:00:29.275585835Z","updated_at":"2017-01-27T04:00:30.378503307Z","status":"Success","status_code":200,"resources":{"containers":["/1.0/containers/docker"]},"metadata":{"return":127},"may_cancel":false,"err":""}}

@CarltonSemple
Author

Sorry, to clarify: I switched to a container where the Docker installation wasn't broken.

@niko-daniel

I tried to reproduce it and got a similar error:
screenshot - 270117 - 11 12 09

@CarltonSemple
Author

@niko-daniel I'm not too familiar with the code yet, unfortunately. Do you think this issue is more appropriate for the https://github.com/lxc/lxc repository?

@stgraber
Contributor

@CarltonSemple no, this is done using LXD so this repository is the right place for this report.

@stgraber
Contributor

Sounds like Docker is confused by the mount table in the LXD container. We'll have to take a look and may have to file an issue with Docker upstream for this.

@stgraber
Contributor

@mwhudson @brauner does that ring a bell with either of you?

@mwhudson

No. But it's trivial to reproduce:

$ lxc exec docker -- docker run --privileged hello-world
docker: Error response from daemon: linux runtime spec devices: open /dev/.lxc/proc/145/fdinfo: permission denied.
ERRO[0000] error getting events from daemon: net/http: request canceled 

This turns out to be with the Docker upstream 1.13.0-rc1 package installed in the container, because that's what I happened to have lying around. I can try with a newer version or the Ubuntu packaging, but I doubt that will make a difference.

@mwhudson

mwhudson commented Jan 29, 2017 via email

@CarltonSemple
Author

Just so you know, I hit another issue that is probably related: inside an LXD container, docker pull of some images doesn't complete (moby/moby#30569).

@CarltonSemple
Author

Hi @stgraber, do you have any temporary workarounds, or a rough estimate of when this might be worked on?

@CarltonSemple
Author

Or do you have any pointers on where to look so I could help?
@stgraber

@stgraber
Contributor

stgraber commented Feb 8, 2017

Hmm, so I'm not sure there's an easy workaround for this other than having Docker fixed to not use paths it can't access...

/dev/.lxc/proc and /dev/.lxc/sys are paths mounted by LXC to allow container nesting in environments where the kernel would otherwise deny mounting new proc and sys instances because the overmounting protection kicks in. Unmounting them would prevent spawning nested containers, and allowing reads and writes under those paths would be a potential security issue.

The correct fix here is to fix Docker with one of:

  • Just assume that proc is mounted at /proc and sys at /sys, it's not like they can really be mounted somewhere else and not at those locations without breaking the system.
  • Keep their current logic but skip /dev/.lxc
  • Keep their current logic and skip to the next match on permission denied
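
To make the second and third options concrete, here is a minimal sketch in Python (Docker and runC are written in Go, so this is an illustration of the idea only, not their code). It assumes the caller already has a list of candidate proc mount points; pick_proc_mount, is_under and LXC_NESTING_PREFIXES are hypothetical names invented for this example.

```python
import os

# Hypothetical default: prefixes that LXC mounts only to support nesting.
LXC_NESTING_PREFIXES = ("/dev/.lxc",)


def is_under(path, prefix):
    """Return True if path equals prefix or lies somewhere below it."""
    return path == prefix or path.startswith(prefix.rstrip("/") + "/")


def pick_proc_mount(candidates, skip_prefixes=LXC_NESTING_PREFIXES, probe=os.listdir):
    """Pick the first usable proc mount point from candidates.

    Option 2: candidates under a skipped prefix are ignored outright.
    Option 3: a candidate that raises PermissionError when probed is
    skipped and the next match is tried.
    Option 1 is the fallback: assume the standard /proc location.
    """
    for path in candidates:
        if any(is_under(path, prefix) for prefix in skip_prefixes):
            continue
        try:
            probe(path)
        except PermissionError:
            continue
        return path
    return "/proc"
```

With candidates such as ["/dev/.lxc/proc", "/proc"], the first entry is skipped by prefix and /proc is returned without /dev/.lxc ever being touched.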

@stgraber
Contributor

stgraber commented Feb 8, 2017

@mwhudson sounds like we may want an extra test for --privileged in our autopkgtest and a patch to fix this issue ^

@CarltonSemple
Author

@stgraber Agreed. I'm doing some investigating right now, building docker myself. The error comes when the daemon tries to read process information about itself, in /dev/.lxc/proc/<daemon_pid>.

@stgraber
Contributor

stgraber commented Feb 8, 2017

@CarltonSemple what's odd is that it only does so when --privileged is passed. I'd think that Docker would access /proc/self at some other point too... Anyway, I still find it weird that they go through all the trouble of scanning /proc/mounts (or mountinfo) for all instances of /proc rather than just picking the standard path for it...

@CarltonSemple
Author

CarltonSemple commented Feb 8, 2017

@stgraber Yes. The function where the behavior changes is setDevices() in /docker/daemon/oci_linux.go. The code that breaks is actually from the OpenContainers repository in /runc/libcontainer/devices/devices_unix.go.

I was able to add ".lxc" to the skipped directories in the getDevices() function, getting past the "no such file or directory" error.
Now it runs into a permission error. This is the Docker daemon output, with some extra debug statements:

INFO[0001] Loading containers: done.                    
INFO[0001] Daemon has completed initialization          
INFO[0001] Docker daemon                                 commit=1564f02-unsupported graphdriver=overlay version=1.12.4
INFO[0001] API listen on /var/run/docker.sock           
Privileged...
skipping  .lxc
skipping  .lxd-mounts
appending
appending
/dev / hugepages
skipping  lxd
skipping  mqueue
/dev / net
appending
appending
skipping  pts
appending
skipping  shm
appending
appending
appending
ERRO[0010] containerd: start container                   error=oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:291: setting cgroup config for ready process caused \"failed to write a *:* rwm to devices.allow: write /sys/fs/cgroup/devices/docker/997a8faece21427b2efe84c901978b178c3cbcb67da6787c2782ba23750c8a21/devices.allow: operation not permitted\""
 id=997a8faece21427b2efe84c901978b178c3cbcb67da6787c2782ba23750c8a21
ERRO[0010] Create container failed with error: invalid header field value "oci runtime error: container_linux.go:247: starting container process caused \"process_linux.go:291: setting cgroup config for ready process caused \\\"failed to write a *:* rwm to devices.allow: write /sys/fs/cgroup/devices/docker/997a8faece21427b2efe84c901978b178c3cbcb67da6787c2782ba23750c8a21/devices.allow: operation not permitted\\\"\"\n" 
ERRO[0010] Handler for POST /v1.24/containers/997a8faece21427b2efe84c901978b178c3cbcb67da6787c2782ba23750c8a21/start returned error: invalid header field value "oci runtime error: container_linux.go:247: starting container process caused \"process_linux.go:291: setting cgroup config for ready process caused \\\"failed to write a *:* rwm to devices.allow: write /sys/fs/cgroup/devices/docker/997a8faece21427b2efe84c901978b178c3cbcb67da6787c2782ba23750c8a21/devices.allow: operation not permitted\\\"\"\n" 
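
The directory skip described before this log can be sketched roughly as follows. This is a Python illustration of the idea, not the Go code in runC's getDevices(); the skipped names are simply the ones printed as "skipping" in the debug output above, and collect_devices and its is_device parameter are hypothetical.

```python
import os
import stat

# Directory names taken from the "skipping" lines in the debug output above.
SKIPPED_DIRS = frozenset({".lxc", ".lxd-mounts", "lxd", "mqueue", "pts", "shm"})


def collect_devices(root, skipped=SKIPPED_DIRS, is_device=None):
    """Recursively list device nodes under root, never descending into skipped dirs."""
    if is_device is None:
        # Default: only character and block devices count.
        def is_device(st):
            return stat.S_ISCHR(st.st_mode) or stat.S_ISBLK(st.st_mode)

    found = []
    for entry in sorted(os.scandir(root), key=lambda e: e.name):
        if entry.is_dir(follow_symlinks=False):
            if entry.name in skipped:
                continue  # e.g. /dev/.lxc, whose proc entries come and go
            found.extend(collect_devices(entry.path, skipped, is_device))
            continue
        try:
            st = os.lstat(entry.path)
        except FileNotFoundError:
            continue  # entry vanished between listing and lstat
        if is_device(st):
            found.append(entry.path)
    return found
```

Calling collect_devices("/dev") would then walk /dev without ever issuing an lstat below /dev/.lxc/proc, which is where the "no such file or directory" error originated.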

@stgraber
Contributor

stgraber commented Feb 8, 2017

Ok, that one is expected too, you can't write to devices.allow or devices.deny from inside a user namespace. I'm guessing Docker already has logic to deal with this somewhere else, otherwise normal containers would fail too.
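
For readers wondering how a tool could tell that it is inside a user namespace before attempting such a write, one common approach is to inspect /proc/self/uid_map: in the initial namespace it contains a single identity mapping covering the whole 32-bit range. The sketch below is a Python illustration under that assumption; the function names are hypothetical and this is not presented as Docker's actual logic.

```python
def parse_id_map(text):
    """Parse uid_map/gid_map contents into (inside_id, outside_id, count) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # ignore blank or malformed lines
        entries.append(tuple(int(field) for field in fields))
    return entries


def looks_like_user_namespace(uid_map_text):
    """Return False only for the identity mapping used by the initial namespace."""
    return parse_id_map(uid_map_text) != [(0, 0, 4294967295)]


def running_in_user_namespace(path="/proc/self/uid_map"):
    """Best-effort check; an unreadable map is treated as 'not in a user namespace'."""
    try:
        with open(path) as handle:
            return looks_like_user_namespace(handle.read())
    except OSError:
        return False
```

A caller could use the result to skip writes to devices.allow and devices.deny rather than failing on "operation not permitted".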

@CarltonSemple
Author

It looks related to this older issue: https://bugs.launchpad.net/ubuntu/+source/lxd/+bug/1593301

@stgraber
Contributor

stgraber commented Feb 8, 2017

Though that was only showing up when the LXD container itself was privileged. It shouldn't happen when the LXD container is unprivileged, unless Docker's --privileged bypasses whatever code was avoiding that cgroup config.

@CarltonSemple
Author

CarltonSemple commented Feb 9, 2017

Gotcha. So I switched back to an unprivileged LXD container, and that error no longer shows up.
Now I'm getting what seems to be an issue with Docker inside the LXD container not being able to view all of a process's information.

root@docker2:~# ./dockerd-1.13.0 
INFO[0000] libcontainerd: new containerd process, pid: 475 
INFO[0001] [graphdriver] using prior storage driver: overlay 
INFO[0001] Graph migration to content-addressability took 0.00 seconds 
WARN[0001] Your kernel does not support swap memory limit. 
WARN[0001] Your kernel does not support cgroup rt period 
WARN[0001] Your kernel does not support cgroup rt runtime 
INFO[0001] Loading containers: start.                   
WARN[0001] Running modprobe bridge br_netfilter failed with message: modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.4.0-59-generic/modules.dep.bin'
modprobe: WARNING: Module bridge not found in directory /lib/modules/4.4.0-59-generic
modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.4.0-59-generic/modules.dep.bin'
modprobe: WARNING: Module br_netfilter not found in directory /lib/modules/4.4.0-59-generic
, error: exit status 1 
WARN[0001] Running modprobe nf_nat failed with message: `modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.4.0-59-generic/modules.dep.bin'
modprobe: WARNING: Module nf_nat not found in directory /lib/modules/4.4.0-59-generic`, error: exit status 1 
WARN[0001] Running modprobe xt_conntrack failed with message: `modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.4.0-59-generic/modules.dep.bin'
modprobe: WARNING: Module xt_conntrack not found in directory /lib/modules/4.4.0-59-generic`, error: exit status 1 
INFO[0001] Firewalld running: false                     
INFO[0001] Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address 
INFO[0002] Loading containers: done.                    
INFO[0002] Daemon has completed initialization          
INFO[0002] Docker daemon                                 commit=49bf474-unsupported graphdriver=overlay version=1.13.0
INFO[0002] API listen on /var/run/docker.sock           
WARN[0103] Your kernel does not support swap memory limit. 
WARN[0103] Your kernel does not support cgroup rt period 
WARN[0103] Your kernel does not support cgroup rt runtime 
ERRO[0103] containerd: start container                   error=oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:252: getting pipe fds for pid 631 caused \"readlink /proc/631/fd/0: permission denied\""
 id=b638a5975734e527a8c15fc24b5c6222c8ceced1414c7657a6a4362ebb5ed95c
ERRO[0103] Create container failed with error: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:252: getting pipe fds for pid 631 caused \"readlink /proc/631/fd/0: permission denied\""
 
ERRO[0103] containerd: deleting container                error=exit status 1: "container b638a5975734e527a8c15fc24b5c6222c8ceced1414c7657a6a4362ebb5ed95c does not exist\none or more of the container deletions failed\n"
ERRO[0104] Handler for POST /v1.25/containers/b638a5975734e527a8c15fc24b5c6222c8ceced1414c7657a6a4362ebb5ed95c/start returned error: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:252: getting pipe fds for pid 631 caused \"readlink /proc/631/fd/0: permission denied\""

I'm assuming the module errors are okay, after reading https://github.com/lxc/lxd/issues/2321

What's weird is that the root user inside of the container can go into the /proc/$(pid)/fd directory, but doesn't seem to have access to all of the files.

root@docker2:/proc/469/fd# ls
0  1  10  11  12  13  2  3  4  5  6  7  8  9
root@docker2:/proc/469/fd# cat 2

^C
root@docker2:/proc/469/fd# cat 4
cat: 4: No such device or address
root@docker2:/proc/469/fd# ls
0  1  10  11  12  13  2  3  4  5  6  7  8  9
root@docker2:/proc/469/fd# cat 5
cat: 5: Permission denied
root@docker2:/proc/469/fd# 

@stgraber
Contributor

stgraber commented Feb 9, 2017

That's indeed a bit weird. Can you try to ssh into the container or run docker through a "script /dev/null" session perhaps?

I want to check if maybe the problem is to do with the pts device that you get from LXD when doing "lxc exec" (which coming from the host would be owned by the host and so may cause those permission issues).

@mwhudson

mwhudson commented Feb 9, 2017

> @mwhudson sounds like we may want an extra test for --privileged in our autopkgtest and a patch to fix this issue ^

The extra test is trivial, as is adding a patch or two, once we have them...

@CarltonSemple
Author

@stgraber you mean like this? I also tried building off of git checkout tags/v1.12.3 to get closer to the default apt install docker.io, but I think I might have the wrong version. Dockerd was complaining about not being able to connect to containerd, so that's why I'm building off of git checkout tags/v1.13.0.

root@docker3:~# script /dev/null
Script started, file is /dev/null
# ./dockerd-1.13.0
INFO[0000] libcontainerd: new containerd process, pid: 2854 
INFO[0001] [graphdriver] using prior storage driver: overlay 
INFO[0001] Graph migration to content-addressability took 0.00 seconds 
WARN[0001] Your kernel does not support swap memory limit. 
WARN[0001] Your kernel does not support cgroup rt period 
WARN[0001] Your kernel does not support cgroup rt runtime 
INFO[0001] Loading containers: start.                   
WARN[0001] Running modprobe bridge br_netfilter failed with message: modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.4.0-59-generic/modules.dep.bin'
modprobe: WARNING: Module bridge not found in directory /lib/modules/4.4.0-59-generic
modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.4.0-59-generic/modules.dep.bin'
modprobe: WARNING: Module br_netfilter not found in directory /lib/modules/4.4.0-59-generic
, error: exit status 1 
WARN[0001] Running modprobe nf_nat failed with message: `modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.4.0-59-generic/modules.dep.bin'
modprobe: WARNING: Module nf_nat not found in directory /lib/modules/4.4.0-59-generic`, error: exit status 1 
WARN[0001] Running modprobe xt_conntrack failed with message: `modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.4.0-59-generic/modules.dep.bin'
modprobe: WARNING: Module xt_conntrack not found in directory /lib/modules/4.4.0-59-generic`, error: exit status 1 
INFO[0001] Firewalld running: false                     
INFO[0001] Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address 
INFO[0001] Loading containers: done.                    
INFO[0001] Daemon has completed initialization          
INFO[0001] Docker daemon                                 commit=49bf474-unsupported graphdriver=overlay version=1.13.0
INFO[0001] API listen on /var/run/docker.sock           
WARN[0042] Your kernel does not support swap memory limit. 
WARN[0042] Your kernel does not support cgroup rt period 
WARN[0042] Your kernel does not support cgroup rt runtime 
ERRO[0042] containerd: start container                   error=oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:252: getting pipe fds for pid 3002 caused \"readlink /proc/3002/fd/0: permission denied\""
 id=96eee0b4ec4e56336f030bc34b881121f1c7c109c13bf2265fa166838a89cb06
ERRO[0042] Create container failed with error: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:252: getting pipe fds for pid 3002 caused \"readlink /proc/3002/fd/0: permission denied\""
 
ERRO[0042] containerd: deleting container                error=exit status 1: "container 96eee0b4ec4e56336f030bc34b881121f1c7c109c13bf2265fa166838a89cb06 does not exist\none or more of the container deletions failed\n"
ERRO[0042] Handler for POST /v1.25/containers/96eee0b4ec4e56336f030bc34b881121f1c7c109c13bf2265fa166838a89cb06/start returned error: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:252: getting pipe fds for pid 3002 caused \"readlink /proc/3002/fd/0: permission denied\""

@CarltonSemple
Author

The error looks like it's coming from https://github.com/opencontainers/runc/blob/master/libcontainer/process_linux.go#L247 , and might be related to opencontainers/runc#1130

@CarltonSemple
Author

@stgraber is the code for the LXD version of Docker on Github anywhere? I want to make sure I'm not running into unrelated errors.

@stgraber
Contributor

@CarltonSemple I don't believe @mwhudson maintains it in git but you can go to https://launchpad.net/ubuntu/+source/docker.io and grab the debian.tar.xz tarball for the latest release there, you'll find a debian/patches directory which includes all the patches that Ubuntu's docker has on top of the clean upstream tarball.

@mwhudson

mwhudson commented Feb 13, 2017 via email

@CarltonSemple
Author

Can we confirm that opencontainers/runc#1327 solves the original problem? In any of the docker.io packages at https://launchpad.net/ubuntu/+source/docker.io, the runC code of interest is in the vendor directory of docker. @mwhudson @stgraber

@cyphar
Contributor

cyphar commented Feb 16, 2017

@CarltonSemple

With regards to this issue:

> What's weird is that the root user inside of the container can go into the /proc/$(pid)/fd directory, but doesn't seem to have access to all of the files.

That's to be expected. Trying to open and then read from a bunch of random file descriptors a process has open is not always going to work -- what would it mean to read from a socket(2) descriptor for example? Same goes for some pty related file descriptors as well.
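
To see what each descriptor actually refers to without trying to read from it, one option is to resolve the symlinks under /proc/<pid>/fd instead. The following Python sketch does that; describe_fds and classify_fd_target are hypothetical helper names, and the classification is a rough heuristic based on the usual link-target formats (socket:[inode], pipe:[inode], anon_inode:..., /dev/pts/N).

```python
import os


def classify_fd_target(target):
    """Roughly classify the link target of a /proc/<pid>/fd entry."""
    if target.startswith("socket:["):
        return "socket"
    if target.startswith("pipe:["):
        return "pipe"
    if target.startswith("anon_inode:"):
        return "anon_inode"
    if target == "/dev/ptmx" or target.startswith("/dev/pts/"):
        return "pty"
    return "file"


def describe_fds(fd_dir="/proc/self/fd"):
    """Map each fd number to (kind, target), or ("unreadable", reason) on failure."""
    result = {}
    for name in sorted(os.listdir(fd_dir), key=int):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError as exc:
            # readlink can be refused (as in the runC log above) or the fd may be gone.
            result[int(name)] = ("unreadable", exc.strerror)
            continue
        result[int(name)] = (classify_fd_target(target), target)
    return result
```

Running describe_fds("/proc/469/fd") on the process from the earlier listing would show which of the entries are sockets or ptys, which would account for cat failing on some of them.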

@CarltonSemple
Author

@cyphar gotcha. I was able to get past that error though, using an update to runc. I now have Docker 1.13 unprivileged containers running.

Have you been able to verify opencontainers/runc#1327 ? I am not familiar with debian packaging, so the best I could do was download the docker.orig.tar.gz and try building that. opencontainers/runc#1327 gets me past the docker: Error response from daemon: linux runtime spec devices: lstat /dev/.lxc/proc/1482/fdinfo/12: no such file or directory. error.

I was hoping we could at least cover that error in this issue and open another issue for privileged Docker containers.

@cyphar
Contributor

cyphar commented Feb 16, 2017

opencontainers/runc#1327 looks like an okay change to me, and I'll LGTM it once I figure out why the CI broke (it's probably not because of your change). It should fix your issue with Docker, but I would recommend verifying that yourself (try replacing docker-runc with a new runC build).

@hqhq

hqhq commented Feb 17, 2017

@cyphar CI failure is related to opencontainers/runc#1237

@CarltonSemple
Author

CarltonSemple commented Feb 17, 2017

@cyphar yes, I included the opencontainers/runc#1327 change when building dockerd, and that fixed docker: Error response from daemon: linux runtime spec devices: lstat /dev/.lxc/proc/1482/fdinfo/12: no such file or directory without needing to replace docker-runc. The error comes from dockerd using runC code.

As for docker-runc's error error=oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:252: getting pipe fds for pid 3002 caused \"readlink /proc/3002/fd/0: permission denied\"", you may find the patch in this ubuntu discussion helpful. The patch there looks related to what @hqhq mentioned.

Using that, I was able to get past error=oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:252: getting pipe fds for pid 3002 caused \"readlink /proc/3002/fd/0: permission denied\"" (see my version of nsexec.c here).

Now I have the error containerd: start container error=oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:359: container init caused \"rootfs_linux.go:65: creating device nodes caused \\\"open /var/lib/docker/overlay/535dce229186dc678699d8df1a6121878da5feb6f205fe4f75ea731616ec807f/merged/dev/tty: no such device or address\\\"\"" id=682a74a9ea55ec105c78a06b5b40590c6587f5a376236518c683b0d9f8621f73 when attempting to start a privileged Docker container.

@CarltonSemple
Author

@stgraber does that last error look like more of an LXD issue?

@stgraber
Contributor

@CarltonSemple not sure. It says it's creating device nodes, but the error suggests it attempted to open it for read/write.

/dev/tty in LXD containers should work fine so I'm not sure why the one inside the docker container doesn't.
Can you confirm that "cat /dev/tty" in your LXD container works properly (that is, doesn't print an error and hangs there)?
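
A non-interactive variant of that check, which only opens the device and reports the errno name instead of blocking on a read, could look like the sketch below. It is a Python illustration; probe_open is a hypothetical helper. Note that opening /dev/tty from a process with no controlling terminal is itself documented to fail with ENXIO ("No such device or address"), so the result is only meaningful when run from a real terminal session.

```python
import errno
import os


def probe_open(path="/dev/tty"):
    """Try to open path read/write; return "ok" or the symbolic errno name."""
    try:
        # O_NOCTTY: never let the probe acquire a controlling terminal.
        fd = os.open(path, os.O_RDWR | os.O_NOCTTY)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return "ok"
```

For example, print(probe_open()) inside the LXD container and again inside a Docker container would show whether the two environments behave differently for the same open call.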

@cyphar
Contributor

cyphar commented Feb 17, 2017

@CarltonSemple I would very strongly recommend against using the line you linked -- it makes runC vulnerable to race conditions that allow processes to access the host (see CVE-2016-9962). I'm working on making this code safe and work properly -- please look at opencontainers/runc#774 and see if you can cherry-pick the changes to nsexec.c and process_linux.go (though they are substantial).

As for the creating device nodes issue, I have the feeling that the error you're seeing (ENXIO) is coming from overlayfs (I'm not sure why an mknod would fail in that manner, though; it's quite odd). I'm not sure exactly where, so you'll have to debug your setup yourself.

@cyphar
Contributor

cyphar commented Feb 17, 2017

@brauner Since you wrote the patch I would recommend reading the above. The stuff I mentioned at FOSDEM about PR_SET_DUMPABLE with rootless containers has been fixed in opencontainers/runc#774. I can help you split out the key parts of the patch so you get the most important bits. ^^

@stgraber added the External label (Issue is about a bug/feature in another project) on Mar 8, 2017
@mbruzek

mbruzek commented Mar 29, 2017

This issue was noted in a LXC/LXD presentation at 2017 KubeCon Berlin.

@cyphar
Contributor

cyphar commented Mar 29, 2017

opencontainers/runc#774 has been merged and from what I last heard, @brauner is going to backport or otherwise apply opencontainers/runc@6bd4bd9 (which should fix the unprivileged problems noted above).

@brauner
Contributor

brauner commented Mar 29, 2017

Yes, I'm not the one directly responsible for this but @mwhudson does have a plan. :)

@mwhudson

Can I get a test case for the unprivileged problems please?

And am I correct to think that there is no pending fix for "lxc exec docker -- docker run --privileged"?

(Sorry, I find this report pretty confusing)

@CarltonSemple
Author

@mwhudson I believe opencontainers/runc#774 is supposed to fix some of the issues, right @cyphar ?
I will check for myself soon

@infinitydon

Please can someone help me with the steps to fix the docker: Error response from daemon: linux runtime spec devices: lstat /dev/.lxc/proc/1482/fdinfo/12: no such file or directory. error.

I read that I have to use a specific runC binary

@stgraber
Contributor

@infinitydon For now, stick to the docker.io package from Ubuntu, not the one you download from Docker upstream; otherwise things won't work.

@stgraber
Contributor

I'm closing this issue because there's nothing for LXD to do here.
That's not to say we won't be doing anything. @mwhudson is still looking at including @cyphar's fixes in the Ubuntu package to improve on our current workaround. @cyphar's work upstream should also mean fewer regressions of Docker in LXD, and hopefully the next upstream Docker build will work unmodified in LXD containers.

We also plan on adding daily testing of the dev branch of Docker to detect regressions before they hit users.

@VelorumS

VelorumS commented Apr 20, 2017

> @infinitydon For now, stick to the docker.io package from Ubuntu, not the one you download from Docker upstream otherwise things won't work.

@stgraber , the same error appears with docker.io=1.12.6-0ubuntu1~16.04.1, runc=1.0.0~rc2-0ubuntu2~16.04.1: while using kubeadm the kube-proxy can't start because of

Error response from daemon: {"message":"linux runtime spec devices: lstat /dev/.lxc/proc/14326/fdinfo/12: no such file or directory"}

What is the current workaround?

@carpenike

Also running into this issue as I'm attempting to run Kubernetes inside of LXD containers.

It appears there may be a fix in 17.06 docker, but that hasn't hit the edge repo yet:

moby/moby#32968

@lazypower

As far as I'm aware, Kubernetes hasn't officially released support for Docker > 1.13.x. That may change with the 1.7 release, so I don't think upgrading to the CE Docker flavors of 17.x will be feasible at this time without introducing other unforeseen problems.

@carpenike

Ahh, one step forward, two steps back.

@ronaldpetty

ronaldpetty commented Jun 7, 2017

Hello, I am probably in over my head here. I'm trying to run Docker via LX[CD]. Earlier it was suggested to use docker.io rather than other packages, but that did not work here. I am on a Mac running this in a VM (Fusion), and I am trying to run Swarm across several LXD instances in that VM. Any tips would be great.

user@ubuntu:~$ lxc exec nodef -- sh -c 'apt-get update && apt-get install docker.io && docker version && docker run hello-world'
Get:1 http://security.ubuntu.com/ubuntu xenial-security InRelease [102 kB]
...                                  
Get:42 http://archive.ubuntu.com/ubuntu xenial-backports/universe Translation-en [2872 B]                                     
Fetched 23.9 MB in 53s (442 kB/s)                                                                                             
Reading package lists... Done
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following additional packages will be installed:
  bridge-utils cgroupfs-mount containerd runc ubuntu-fan
Suggested packages:
  mountall aufs-tools debootstrap docker-doc rinse zfs-fuse | zfsutils
The following NEW packages will be installed:
  bridge-utils cgroupfs-mount containerd docker.io runc ubuntu-fan
0 upgraded, 6 newly installed, 0 to remove and 20 not upgraded.
Need to get 16.4 MB of archives.
After this operation, 83.6 MB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://archive.ubuntu.com/ubuntu xenial/main amd64 bridge-utils amd64 1.5-9ubuntu1 [28.6 kB]
...   
Get:6 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 ubuntu-fan all 0.9.2 [30.7 kB]                               
Fetched 16.4 MB in 25s (644 kB/s)                                                                                             
Selecting previously unselected package bridge-utils.
(Reading database ... 25504 files and directories currently installed.)
Preparing to unpack .../bridge-utils_1.5-9ubuntu1_amd64.deb ...
Unpacking bridge-utils (1.5-9ubuntu1) ...
Selecting previously unselected package cgroupfs-mount.
Preparing to unpack .../cgroupfs-mount_1.2_all.deb ...
Unpacking cgroupfs-mount (1.2) ...
Selecting previously unselected package runc.
Preparing to unpack .../runc_1.0.0~rc2-0ubuntu2~16.04.1_amd64.deb ...
Unpacking runc (1.0.0~rc2-0ubuntu2~16.04.1) ...
Selecting previously unselected package containerd.
Preparing to unpack .../containerd_0.2.5-0ubuntu1~16.04.1_amd64.deb ...
Unpacking containerd (0.2.5-0ubuntu1~16.04.1) ...
Selecting previously unselected package docker.io.
Preparing to unpack .../docker.io_1.12.6-0ubuntu1~16.04.1_amd64.deb ...
Unpacking docker.io (1.12.6-0ubuntu1~16.04.1) ...
Selecting previously unselected package ubuntu-fan.
Preparing to unpack .../ubuntu-fan_0.9.2_all.deb ...
Unpacking ubuntu-fan (0.9.2) ...
Processing triggers for man-db (2.7.5-1) ...
Processing triggers for ureadahead (0.100.0-19) ...
Processing triggers for systemd (229-4ubuntu17) ...
Setting up bridge-utils (1.5-9ubuntu1) ...
Setting up cgroupfs-mount (1.2) ...
Setting up runc (1.0.0~rc2-0ubuntu2~16.04.1) ...
Setting up containerd (0.2.5-0ubuntu1~16.04.1) ...
Setting up docker.io (1.12.6-0ubuntu1~16.04.1) ...
Adding group `docker' (GID 116) ...
Done.
Setting up ubuntu-fan (0.9.2) ...
Processing triggers for systemd (229-4ubuntu17) ...
Processing triggers for ureadahead (0.100.0-19) ...
Client:
 Version:      1.12.6
 API version:  1.24
 Go version:   go1.6.2
 Git commit:   78d1802
 Built:        Tue Jan 31 23:35:14 2017
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.6
 API version:  1.24
 Go version:   go1.6.2
 Git commit:   78d1802
 Built:        Tue Jan 31 23:35:14 2017
 OS/Arch:      linux/amd64
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
78445dd45222: Pull complete 
Digest: sha256:c5515758d4c5e1e838e9cd307f6c6a0d620b5e07e6f927b07d05f6d12a1ac8d7
Status: Downloaded newer image for hello-world:latest
container_linux.go:247: starting container process caused "process_linux.go:359: container init caused \"rootfs_linux.go:42: preparing rootfs caused \\\"permission denied\\\"\""
docker: Error response from daemon: containerd: container not started.

With --privileged:

user@ubuntu:~$ lxc exec nodef -- sh -c 'docker run --privileged hello-world'
docker: Error response from daemon: linux runtime spec devices: open /dev/.lxd-mounts: permission denied.
user@ubuntu:~$

@benoitjpnet

I'm sorry to hijack this issue, but do we have some patch to fix this error?

oci runtime error: container_linux.go:262: starting container process caused "process_linux.go:339: container init caused \"rootfs_linux.go:69: creating device nodes caused \\\"open /var/lib/docker/vfs/dir/6bfe63ddb9d78aa0f53bbf7e3de31271c321a052cb2559fa2275d94c40166997/dev/tty: no such device or address\\\"\""

@cheako

cheako commented Sep 28, 2017

I got this error, with exit code 1.
I set up a libvirt LXC container with Debian sid and installed docker.io.
