Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runc create/run: warn on rootless + shared pidns + no cgroup #4398

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions libcontainer/cgroups/cgroups.go
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,11 @@ var (
// is not configured to set device rules.
ErrDevicesUnsupported = errors.New("cgroup manager is not configured to set device rules")

// ErrRootless is returned by [Manager.Apply] when there is an error
// creating cgroup directory, and cgroup.Rootless is set. In general,
// this error is to be ignored.
ErrRootless = errors.New("cgroup manager can not access cgroup (rootless container)")

// DevicesSetV1 and DevicesSetV2 are functions to set devices for
// cgroup v1 and v2, respectively. Unless
// [github.com/opencontainers/runc/libcontainer/cgroups/devices]
Expand Down
5 changes: 3 additions & 2 deletions libcontainer/cgroups/fs/fs.go
Original file line number Diff line number Diff line change
Expand Up @@ -105,7 +105,7 @@ func isIgnorableError(rootless bool, err error) bool {
return false
}

func (m *Manager) Apply(pid int) (err error) {
func (m *Manager) Apply(pid int) (retErr error) {
m.mu.Lock()
defer m.mu.Unlock()

Expand All @@ -129,14 +129,15 @@ func (m *Manager) Apply(pid int) (err error) {
// later by Set, which fails with a friendly error (see
// if path == "" in Set).
if isIgnorableError(c.Rootless, err) && c.Path == "" {
retErr = cgroups.ErrRootless
Copy link
Member

@lifubang lifubang Sep 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think this is too strict? Maybe we should add a condition here:

if name == "devices" {
    retErr = cgroups.ErrRootless
}

delete(m.paths, name)
continue
}
return err
}

}
return nil
return retErr
}

func (m *Manager) Destroy() error {
Expand Down
2 changes: 1 addition & 1 deletion libcontainer/cgroups/fs2/fs2.go
Original file line number Diff line number Diff line change
Expand Up @@ -71,7 +71,7 @@ func (m *Manager) Apply(pid int) error {
if m.config.Rootless {
if m.config.Path == "" {
if blNeed, nErr := needAnyControllers(m.config.Resources); nErr == nil && !blNeed {
return nil
return cgroups.ErrRootless
}
return fmt.Errorf("rootless needs no limits + no cgrouppath when no permission is granted for cgroups: %w", err)
}
Expand Down
2 changes: 1 addition & 1 deletion libcontainer/configs/validate/rootless.go
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,7 @@ func rootlessEUIDMappings(config *configs.Config) error {
return errors.New("rootless container requires user namespaces")
}
// We only require mappings if we are not joining another userns.
if path := config.Namespaces.PathOf(configs.NEWUSER); path == "" {
if config.Namespaces.IsPrivate(configs.NEWUSER) {
if len(config.UIDMappings) == 0 {
return errors.New("rootless containers requires at least one UID mapping")
}
Expand Down
13 changes: 12 additions & 1 deletion libcontainer/process_linux.go
Original file line number Diff line number Diff line change
Expand Up @@ -580,7 +580,18 @@ func (p *initProcess) start() (retErr error) {
// cgroup. We don't need to worry about not doing this and not being root
// because we'd be using the rootless cgroup manager in that case.
if err := p.manager.Apply(p.pid()); err != nil {
return fmt.Errorf("unable to apply cgroup configuration: %w", err)
if errors.Is(err, cgroups.ErrRootless) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the purpose of warning, I think this PR LGTM.
But further more, I think we should serialize this to state.json, for example with a field name noCgroup, it is useful for doing runc kill.
Please see: #4395 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I was thinking about it, too, but let's do improvement at a time, shall we?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, we are trying to cut 1.2.0 for some time now, and this PR is a result of a recent development in the area (a regression described in #4394 (and fixed by #4395). As I noted in #4394 (comment), I'm OK with the fix in #4395, but it would be nice to also introduce a warning; this is what this PR does.

I think we can introduce noCgroup in runc 1.3. Feel free to open an issue about it so we won't forget.

// ErrRootless is to be ignored except when
// the container doesn't have private pidns.
if !p.config.Config.Namespaces.IsPrivate(configs.NEWPID) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to not put this condition in the previous if? Do you think it is more readable like this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did that initially, but rolled it back later since changing the code like this complicates the review (the whole block changes instead of just one line).

I will add a separate commit that does it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad; I mixed this up with something else. Updated; PTAL @rata

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ughm, I just broke everything.

This is two if statements (and an else) because we want to

  • ignore ErrRootless;
  • except there's no private pidns, in which case we want a warning;
  • return all other errors as is.

No way to do that with a single if.

// TODO: make this an error in runc 1.3.
logrus.Warn("Creating a rootless container with no cgroup and no private pid namespace. " +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about s/no cgroup/no devices cgroup/ ?

"Such configuration is strongly discouraged (as it is impossible to properly kill all container's processes) " +
"and will result in an error in a future runc version.")
}
} else {
return fmt.Errorf("unable to apply cgroup configuration: %w", err)
}
}
if p.intelRdtManager != nil {
if err := p.intelRdtManager.Apply(p.pid()); err != nil {
Expand Down
4 changes: 2 additions & 2 deletions libcontainer/specconv/spec_linux.go
Original file line number Diff line number Diff line change
Expand Up @@ -415,7 +415,7 @@ func CreateLibcontainerConfig(opts *CreateOpts) (*configs.Config, error) {
}
config.Namespaces.Add(t, ns.Path)
}
if config.Namespaces.Contains(configs.NEWNET) && config.Namespaces.PathOf(configs.NEWNET) == "" {
if config.Namespaces.IsPrivate(configs.NEWNET) {
config.Networks = []*configs.Network{
{
Type: "loopback",
Expand Down Expand Up @@ -481,7 +481,7 @@ func CreateLibcontainerConfig(opts *CreateOpts) (*configs.Config, error) {
// Only set it if the container will have its own cgroup
// namespace and the cgroupfs will be mounted read/write.
//
hasCgroupNS := config.Namespaces.Contains(configs.NEWCGROUP) && config.Namespaces.PathOf(configs.NEWCGROUP) == ""
hasCgroupNS := config.Namespaces.IsPrivate(configs.NEWCGROUP)
hasRwCgroupfs := false
if hasCgroupNS {
for _, m := range config.Mounts {
Expand Down
28 changes: 28 additions & 0 deletions tests/integration/create.bats
Original file line number Diff line number Diff line change
Expand Up @@ -81,3 +81,31 @@ function teardown() {

testcontainer test_busybox running
}

# https://github.com/opencontainers/runc/issues/4394#issuecomment-2334926257
@test "runc create [shared pidns + rootless]" {
# Remove pidns so it's shared with the host.
update_config ' .linux.namespaces -= [{"type": "pid"}]'
if [ $EUID -ne 0 ]; then
if rootless_cgroup; then
# Rootless containers have empty cgroup path by default.
set_cgroups_path
fi
# Can't mount real /proc when rootless + no pidns,
# so change it to a bind-mounted one from the host.
update_config ' .mounts |= map((select(.type == "proc")
| .type = "none"
| .source = "/proc"
| .options = ["rbind", "nosuid", "nodev", "noexec"]
) // .)'
fi

exp="Such configuration is strongly discouraged"
runc create --console-socket "$CONSOLE_SOCKET" test
[ "$status" -eq 0 ]
if [ $EUID -ne 0 ] && ! rootless_cgroup; then
[[ "$output" = *"$exp"* ]]
else
[[ "$output" != *"$exp"* ]]
fi
}
2 changes: 2 additions & 0 deletions tests/integration/helpers.bash
Original file line number Diff line number Diff line change
Expand Up @@ -727,6 +727,8 @@ function teardown_bundle() {
[ ! -v ROOT ] && return 0 # nothing to teardown

cd "$INTEGRATION_ROOT" || return
echo "--- teardown ---" >&2

teardown_recvtty
local ct
for ct in $(__runc list -q); do
Expand Down
Loading