mount: add enhanced mount functionality to support run container in userns with host network #3613

shidao1 · 2022-09-27T06:41:13Z

In public cloud serverless container products, the runtime environment has some special characteristics.

The container often runs on a separate kernel,
so runc works in host network mode.
The container is unprivileged and does not have CAP_SYS_ADMIN.
Many data acceleration projects use mount.fuse to provide mount points through which applications access data.

The goal is to run the container in a new userns in host network mode.
Before runc switches to the new user ns, the main process uses the open_tree syscall to get fds for the sys, proc and mqueue mount points,
and after runc switches to the new user ns it uses move_mount to mount sys, proc and mqueue.
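
The two-step flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names `openTree` and `moveMount` are made up for the example, the syscall numbers are the ones used on x86-64 and arm64, and the calls go through the raw `syscall` package so the snippet has no external dependencies. It needs Linux 5.2+ and, for `OPEN_TREE_CLONE`, CAP_SYS_ADMIN.

```go
package main

import (
	"fmt"
	"syscall"
	"unsafe"
)

// Values from the Linux UAPI headers (assumed: x86-64 / arm64 syscall table).
const (
	sysOpenTree  = 428
	sysMoveMount = 429

	atFdcwd             = -100
	openTreeClone       = 0x1 // OPEN_TREE_CLONE: return a detached copy of the mount
	moveMountFEmptyPath = 0x4 // MOVE_MOUNT_F_EMPTY_PATH: the source is the fd itself
)

// openTree (hypothetical helper) returns an fd for a detached clone of the
// mount at path. It would run before switching to the new user namespace.
func openTree(path string) (int, error) {
	p, err := syscall.BytePtrFromString(path)
	if err != nil {
		return -1, err
	}
	dirfd := atFdcwd
	fd, _, errno := syscall.Syscall(sysOpenTree, uintptr(dirfd),
		uintptr(unsafe.Pointer(p)), uintptr(openTreeClone|syscall.O_CLOEXEC))
	if errno != 0 {
		return -1, fmt.Errorf("open_tree %s: %w", path, errno)
	}
	return int(fd), nil
}

// moveMount (hypothetical helper) attaches the mount held by fd at target.
// It would run after switching to the new user namespace.
func moveMount(fd int, target string) error {
	empty, err := syscall.BytePtrFromString("")
	if err != nil {
		return err
	}
	t, err := syscall.BytePtrFromString(target)
	if err != nil {
		return err
	}
	dirfd := atFdcwd
	_, _, errno := syscall.Syscall6(sysMoveMount, uintptr(fd),
		uintptr(unsafe.Pointer(empty)), uintptr(dirfd),
		uintptr(unsafe.Pointer(t)), moveMountFEmptyPath, 0)
	if errno != 0 {
		return fmt.Errorf("move_mount to %s: %w", target, errno)
	}
	return nil
}

func main() {
	fd, err := openTree("/proc")
	if err != nil {
		fmt.Println("open_tree failed:", err)
		return
	}
	defer syscall.Close(fd)
	fmt.Println("got detached mount fd", fd)
	// After joining the new user namespace, the fd would be attached with
	// moveMount(fd, "<rootfs>/proc").
}
```

Whether the kernel accepts the `move_mount` call from inside the new user namespace depends on the ownership and permission checks for the source mount, which this sketch does not try to model.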

Extend bootstrap message to pass mount fds for open_tree()/move_mount(). Signed-off-by: Jiang Liu <[email protected]>

Enhance mountToRootfs() to support MoveMount(), so it could be used to support cross user namespace mounting. Signed-off-by: Jiang Liu <[email protected]>

Introduce struct namespace_info_t to split join_namespaces() in stages, so it could be reused later. Signed-off-by: Jiang Liu <[email protected]>

Prepare source mount fds for move_mount() to support cross user namespace mounting. Signed-off-by: Jiang Liu <[email protected]>

When a user namespace is enabled for a pod/container, it may fail to mount /proc, /sys and /dev/mqueue under certain conditions. This may be solved by enabling cross user namespace mounting. Signed-off-by: Jiang Liu <[email protected]> Signed-off-by: shidao.ytt <[email protected]>

kolyshkin · 2022-09-28T18:15:27Z

libcontainer/rootfs_linux.go

@@ -400,12 +400,25 @@ func mountToRootfs(m *configs.Mount, c *mountConfig) error {
 			return err
 		}
 		// Selinux kernels do not support labeling of /proc or /sys
+		if m.IsMove() && *c.fd >= 0 {


Nit: unlike in C, you don't have to dereference a pointer to a struct when accessing its members.

IOW s/*c.fd/c.fd/
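
For reference, a small sketch of how Go treats the two spellings. The `mountConfig` type below is a stand-in written for this example; the hunk above does not show how `fd` is declared in the patch. Go auto-dereferences `c` when selecting a field, and because the selector binds tighter than the unary `*`, `*c.fd` means `*(c.fd)`: it is only valid, and only needed, if `fd` is itself a pointer.

```go
package main

import "fmt"

// mountConfig is a hypothetical stand-in; here fd is assumed to be a *int.
type mountConfig struct {
	fd *int
}

// hasFd reports whether c carries a usable file descriptor.
func hasFd(c *mountConfig) bool {
	// c.fd needs no explicit dereference of c, even though c is a pointer.
	// The leading * dereferences the pointer-typed field, not c.
	return c.fd != nil && *c.fd >= 0
}

func main() {
	fd := 7
	fmt.Println(hasFd(&mountConfig{fd: &fd})) // true
	fmt.Println(hasFd(&mountConfig{}))        // false: fd is nil
}
```

If `fd` were declared as a plain `int`, the check would simply be `c.fd >= 0`.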

kolyshkin · 2022-09-28T18:17:33Z

libcontainer/rootfs_linux.go

@@ -400,12 +400,25 @@ func mountToRootfs(m *configs.Mount, c *mountConfig) error {
 			return err
 		}
 		// Selinux kernels do not support labeling of /proc or /sys


This commit makes the comment above ^^^ misplaced. It used to explain why label.SetFileLabel is not called here.

kolyshkin · 2022-09-28T18:20:38Z

libcontainer/container_linux.go

+				if _, exist := nsList[configs.NEWPID]; exist {
+				}


Is some code missing here?

kolyshkin · 2022-09-28T18:21:43Z

We need test cases for this.

kolyshkin · 2022-09-28T22:14:43Z

Also, I'm afraid you'll have to redo this once #3599 is merged, which refactors some C code in nsenter.

jiangliu · 2022-09-29T02:58:09Z

> Also, I'm afraid you'll have to redo this once #3599 is merged, which refactors some C code in nsenter.

Thanks for the reminder. We also noticed PR #3599, so we opened this PR :)
We will wait for #3599 to settle down first.

Zheaoli · 2023-11-12T05:43:22Z

I think this PR has been inactive for a long time. May I take over the remaining work to get this PR ready to merge? cc @AkihiroSuda

cyphar · 2023-11-16T00:19:29Z

@Zheaoli You'll need to base it on top of #3985, which reworks all of the mountfd logic. I'm not sure how easy it'll be to use the new Go-based setup to implement this though. I suspect you can do it by creating a locked goroutine that joins the container's non-userns namespaces, but the slight issue is that we cannot create a procfs mount that uses the container's pidns because procfs uses the active pidns, not the for_children one (in fact, I'm not sure this PR handles procfs correctly).

Also, you don't want to use open_tree(2) like this -- a much better way is to use fsopen and fsconfig to configure the mount without touching the filesystem, and thus having an anonymous mountfd that you can then provide to the container. (To be fair, I'm not sure if the permissions work out okay with user namespaces in that case.)
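
For illustration, the fsopen/fsconfig route could look roughly like this. It is a sketch, not runc code: `newMountFd` is a name made up for the example, the syscall numbers are those used on x86-64 and arm64, and it needs Linux 5.2+ plus sufficient privileges for the chosen filesystem type. As noted above, whether the permissions work out with user namespaces is an open question that the sketch does not answer.

```go
package main

import (
	"fmt"
	"syscall"
	"unsafe"
)

// Values from the Linux UAPI headers (assumed: x86-64 / arm64 syscall table).
const (
	sysFsopen   = 430
	sysFsconfig = 431
	sysFsmount  = 432

	fsopenCloexec     = 0x1 // FSOPEN_CLOEXEC
	fsconfigCmdCreate = 6   // FSCONFIG_CMD_CREATE
	fsmountCloexec    = 0x1 // FSMOUNT_CLOEXEC
)

// newMountFd (hypothetical helper) creates a new instance of fstype and
// returns an anonymous mount fd that is not yet attached anywhere.
func newMountFd(fstype string) (int, error) {
	name, err := syscall.BytePtrFromString(fstype)
	if err != nil {
		return -1, err
	}
	// fsopen: open a filesystem configuration context.
	sfd, _, errno := syscall.Syscall(sysFsopen, uintptr(unsafe.Pointer(name)), fsopenCloexec, 0)
	if errno != 0 {
		return -1, fmt.Errorf("fsopen %s: %w", fstype, errno)
	}
	defer syscall.Close(int(sfd))

	// fsconfig: create the superblock; mount options would be set before this.
	if _, _, errno := syscall.Syscall6(sysFsconfig, sfd, fsconfigCmdCreate, 0, 0, 0, 0); errno != 0 {
		return -1, fmt.Errorf("fsconfig create %s: %w", fstype, errno)
	}

	// fsmount: turn the configured context into a detached mount fd.
	mfd, _, errno := syscall.Syscall(sysFsmount, sfd, fsmountCloexec, 0)
	if errno != 0 {
		return -1, fmt.Errorf("fsmount %s: %w", fstype, errno)
	}
	return int(mfd), nil
}

func main() {
	fd, err := newMountFd("proc")
	if err != nil {
		fmt.Println("could not create mount fd:", err)
		return
	}
	defer syscall.Close(fd)
	fmt.Println("anonymous mount fd:", fd)
}
```

The resulting fd can then be attached with `move_mount(2)` using `MOVE_MOUNT_F_EMPTY_PATH`, without ever having touched a path on the host filesystem.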

jiangliu and others added 5 commits September 27, 2022 12:58

extend bootstrap message to pass mount fds

08c266d

Extend bootstrap message to pass mount fds for open_tree()/move_mount(). Signed-off-by: Jiang Liu <[email protected]>

rootfs_linux: enable support of MoveMount()

56d878e

Enhance mountToRootfs() to support MoveMount(), so it could be used to support cross user namespace mounting. Signed-off-by: Jiang Liu <[email protected]>

nsexec: split join_namespaces() into stages

844b804

Introduce struct namespace_info_t to split join_namespaces() in stages, so it could be reused later. Signed-off-by: Jiang Liu <[email protected]>

nsexec: prepare mount fds for cross user namespace mounting

2d5bdfb

Prepare source mount fds for move_mount() to support cross user namespace mounting. Signed-off-by: Jiang Liu <[email protected]>
