Better support for rootless containers (#636)
# Short summary

Running containers under rootless docker/podman lets us:

- avoid permission issues in mounted volumes without any extra run
setting (root in container == user at the host)
- avoid granting privileges to host users, enhancing system security
- simplify container images (just run them with root, it's safe because
it's actually the user without extra privileges!)

For backwards compatibility, this pull request assumes users still don't
run rootless docker by default, so we require users running rootless
containers to define `-e RUNROOTLESS=true`, at least until a method is
devised to detect from within the container whether it is running
rootless.

- https://docs.docker.com/engine/security/rootless/
- https://github.com/containers/podman/blob/main/docs/tutorials/rootless_tutorial.md

In this pull request, I modify the userconf script so that, when running
rootless (`-e RUNROOTLESS=true`), it does not create users, change
sudoers, or do any of the related setup. With this change, we can run
RStudio Server under an unprivileged user without issues.

```
mkdir $HOME/work
podman run -ti -e PASSWORD=helloworld -e RUNROOTLESS=true -v $HOME/work:/root/work -p 10000:8787 <img-name>
```

Visit localhost:10000 and log in with user `root` and password `helloworld`.

Thanks for considering merging this.

## Alternative

As an alternative, we could publish separate images, something like
`rocker/rstudio-rootless` or `rocker-rootless/rstudio`:


```
FROM rocker/rstudio
COPY init_userconf.sh /etc/cont-init.d/02_userconf
```

But I think it is easier to just merge this.


# Historic background

Here is a longer story of why things are the way they are, to give some
context for the user-handling mess that exists in docker images. It may
not be fully accurate, but it is good enough.

### Stage 1: We are root

- Docker is a client-server application.
- The docker daemon runs as the root user. 
- docker users need to run docker commands under root or using `sudo`
(e.g. `sudo docker run...`).

This situation is problematic because:
- docker users have root access to the host
- docker images may create/modify files as root in the host (even
accidentally if some host directory is mounted!)
- files created by docker are owned by the root user, which easily
becomes a permission mess
- There are audit limitations, since all docker users connect to the
same docker daemon typically without authentication. We don't know who
does what.

People want to use docker so much that a `docker` user group is
created; users in that `docker` group no longer need to type `sudo`
to run docker, although effectively it is as if they did. The
risk is still there, only hidden.

### Stage 2: Images drop root privileges

Docker is a complex piece of software that relies on namespaces and
control groups, features of the linux kernel that are under heavy
development.

Therefore the fastest and easiest solution to some of these issues comes
from the image builders. The container starts running as root, but image
builders following best practices drop those privileges as soon as
possible. They read environment variables set when the container is
created, so users can choose the user id used to create files, avoiding
the file permission mess.

This depends on the goodwill of the image builder, but "works".

This adds a lot of complexity, because images now have to consider
multiple users and permissions (allow root inside the container for
apt-get install, use sudoers...)

### Stage 3: Run with `docker run --user`

Docker allows specifying the user the container will run as. The Docker
daemon still runs as root, but the container is started with the given
user id, and that user typically does not have root privileges in the
container.

Docker here avoids file permission issues, but at some cost: since the
image's startup scripts no longer have root access in the container,
allowing apt-get inside the container becomes far more tricky.
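
As a rough illustration of this stage, the sketch below builds a `docker run --user` command line for the calling host user. The helper name `run_as_host_user_cmd`, the image name `example/image` and the `/work` mount point are placeholders chosen for this example, not part of rocker. The command is printed rather than executed, so it can be inspected without a docker daemon.

```shell
#!/bin/sh
# Sketch only: print a `docker run` command that starts the container
# with the UID and GID of the calling host user instead of root.
# Arguments: <image> <host directory to mount at /work>
# Note: paths containing spaces would need extra quoting before the
# printed line could be pasted into a shell.
run_as_host_user_cmd() {
    image="$1"
    workdir="$2"
    printf 'docker run --rm --user %s:%s --volume %s:/work %s\n' \
        "$(id -u)" "$(id -g)" "$workdir" "$image"
}

# Hypothetical usage: example/image is a placeholder image name.
run_as_host_user_cmd example/image "$HOME/work"
```

Files written to `/work` by such a container would be owned by the host user, but the process has no root privileges inside the container, which is the cost described above.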

To my knowledge, this option has not seen much adoption in rocker
and jupyter notebook images.

### Stage 4: User namespaces `docker run --userns`

The Linux kernel gains support for user namespaces.

Basically we can map user ids in the container to a range of user ids in
the host. Depending on how this is used, this can lead to files created
by the image not being owned by root anymore, but by a super high user
ID.
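
The mapping arithmetic can be sketched in a few lines of shell. This is an illustration only: the helper name `map_container_uid` and the example ranges (`0 1000 1` and `1 100000 65536`) are assumptions picked for the example, written in the same three-column format the kernel exposes in `/proc/<pid>/uid_map` (first id inside, first id outside, range length).

```shell
#!/bin/sh
# map_container_uid UID MAPFILE
# Translate a UID as seen inside a user namespace into the UID seen from
# the parent namespace. MAPFILE uses the /proc/<pid>/uid_map format:
#   <first id inside> <first id outside> <range length>
# Prints the outside UID, or returns non-zero if the UID is unmapped.
map_container_uid() {
    awk -v uid="$1" '
        uid >= $1 && uid < $1 + $3 { print $2 + (uid - $1); found = 1; exit }
        END { if (!found) exit 1 }
    ' "$2"
}

# Hypothetical map: container root is host UID 1000, and container UIDs
# 1..65536 are backed by host UIDs starting at 100000.
map_file="$(mktemp)"
printf '0 1000 1\n1 100000 65536\n' > "$map_file"

map_container_uid 0 "$map_file"      # prints 1000
map_container_uid 1000 "$map_file"   # prints 100999
rm -f "$map_file"
```

With such a map, a file created by UID 1000 inside the container shows up on the host as owned by UID 100999, which is the "super high user ID" effect mentioned above.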

To my knowledge, this option has not seen much adoption in rocker
and jupyter notebooks.

### Stage 5: Rootless docker

Enough namespace and cgroup solutions exist in the kernel for the docker
daemon to be able to run containers without root permissions.

Running the docker daemon as root is a security risk, and it is also an
auditability issue, so rootless mode is a genuinely better-designed
solution to the permission problem. Now each user can run their own
docker daemon. Since that daemon runs as the user, it has no special
permissions and cannot damage the host.

Therefore we do not need images to drop privileges or follow any best
practices to be responsible, since they are not allowed to do any damage
by default anymore. Things can be simple again! However, we must still be
backwards compatible with those who have to or want to use docker as root.

Hi **podman**! Podman was designed to run rootless by default, and may
even have been able to do so before docker (I don't really know). podman
does not even need a daemon, although podman supports a daemon to increase
compatibility with docker.

How does this work? Using user namespaces in a more transparent way:

- alice, with user id 1000, has a rootless docker daemon running, so she
can run containers without any root privilege.
- alice creates a container (podman run..., docker run...), mounting
some directories she has access to (`--volume`) without caring for
permissions or user ids
- docker/podman creates the container and runs the entrypoint. The
entrypoint seems to run as `root` from within the container, but from
the host it appears to run as `alice`. If the entrypoint scripts do not
do anything weird with the users, the entrypoint and the commands that
run afterwards also run as root/alice. Files created on the mounted
volume appear to be owned by alice. If alice tries to mount a volume she
can't write to (e.g. `--volume /sbin:/somewhere`), the root/alice user in
the container won't be able to write to it either, because alice does not
have permission to do that.

That's all we want and need! 

But... backwards compatibility!

Until someone finds a convention to determine if a container is running
under rootless docker, we, image builders, can't tell if we should drop
privileges or simply use them.

So, I would like to ask alice to set an environment variable to tell me
if she is running rootless, so I can just use the root/alice user
without a care for setting up users and permissions and sudoers. I'll
have to ask alice to use an environment variable for now...
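
The script changed in this commit defaults `RUNROOTLESS` to `auto` and then guesses the mode by looking for `4294967295` in `/proc/self/uid_map`. The sketch below restates that heuristic as a function so it can be tried against sample maps. The function name `looks_rootless` and the sample map contents are assumptions for illustration. One caveat to keep in mind, stated as my understanding rather than something the commit claims: any container whose user namespace does not cover the full ID range (for example a rootful daemon with user namespace remapping) would also be reported as rootless.

```shell
#!/bin/sh
# looks_rootless [UID_MAP_FILE]
# Returns 0 (true) when the UID map does not contain the full 32-bit
# range, and 1 (false) otherwise. Defaults to /proc/self/uid_map.
# A process in the initial user namespace sees the single line
#   0 0 4294967295
# whereas a rootless container sees shorter ranges.
# As in the original one-liner, an unreadable file counts as "rootless".
looks_rootless() {
    map_file="${1:-/proc/self/uid_map}"
    if grep -q 4294967295 "$map_file" 2>/dev/null; then
        return 1
    fi
    return 0
}

# Same decision as the `auto` branch of the script, expressed with the helper.
if looks_rootless; then
    RUNROOTLESS=true
else
    RUNROOTLESS=false
fi
echo "RUNROOTLESS=${RUNROOTLESS}"
```

alice can still override the guess explicitly with `-e RUNROOTLESS=true` or `-e RUNROOTLESS=false`.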
zeehio authored May 13, 2023
1 parent c5bc84a commit 8ab4e7d
Showing 1 changed file with 124 additions and 17 deletions.
141 changes: 124 additions & 17 deletions scripts/init_userconf.sh
@@ -10,6 +10,98 @@ ROOT=${ROOT:=FALSE}
UMASK=${UMASK:=022}
LANG=${LANG:=en_US.UTF-8}
TZ=${TZ:=Etc/UTC}
RUNROOTLESS=${RUNROOTLESS:=auto}

if [ "${RUNROOTLESS}" = "auto" ]; then
RUNROOTLESS=$(grep 4294967295 /proc/self/uid_map >/dev/null && echo "false" || echo "true")
fi

USERHOME="/home/${USER}"

if [ "${RUNROOTLESS}" = "true" ]; then
printf "Assuming the container runs under rootless mode\n"
printf "Under rootless mode,\n"
printf " - You will log in using 'root' as user\n"
printf " - You will have root privileges within the container (e.g. apt)\n"
printf " - The files you create as root on mounted volumes will appear at the host as owned by the user who started the container\n"
printf " - You can't modify host files you don't have permission to\n"
printf " - You should NOT run in RUNROOTLESS=true if you are using the container with privileges (e.g. sudo docker run... or sudo podman run...)\n"
# The container was started asking to login as the root user.
# This is a good approach when running docker or podman rootless
# https://docs.docker.com/engine/security/rootless/
#
# When running docker rootless or podman rootless, the root user in
# the container has the capabilities of the actual host user. Nothing else.
#
# All files modified inside the container by the root user that are mapped
# to the host will appear in the host as modified by the user who runs the
# container. However from inside the container they appear to be modified by
# root.
#
# So, the user can run apt-get as the root user inside the container. No
# need for handling sudoers, since to the container the user is root.
#
# Higher user ids in the container (e.g. 1000) get mapped to very high user
# ids at the host. We don't need that and it just confuses things
USER="root"
USERID=0
GROUPID=0
USERHOME="/root"

# Keep all groups that have been set:
# When running rootless podman, podman may set the groups of the host user
# to the process running in the container with the option
# podman run --group-add keep-groups
#
# This option has the caveat that the GIDs which have not been mapped to
# the container in the namespace will appear as the overflow_gid (65534).
#
# While this process has the GID assigned (and therefore it has the
# privileges granted by that GID), this process cannot internally refer
# to that GID, because it is not mapped, and it appears as nobody/nogroup.
#
# This lack of mapping becomes a problem when we need to be able
# to assign those same groups to the processes created when a user logs
# in through the web interface. There, we are not able to setgroups() as
# podman did when the initial process in the container was started.
#
# What can we do?
# A solution goes through a sysadmin in the host allowing users to
# impersonate the target GID in /etc/subgid.
#
# For instance, if you have a "university_data" group, that has GID 2000
# and you have PhD students "alice" and "bob" who are in that group, you will
# need to have an additional entry in /etc/subgid for each of them:
# alice:2000:1
# bob:2000:1
#
# That entry reads as
# > Grant {alice/bob} the ability to become GID 2000.
#
# Those entries should be **additional** to the already existing entries that
# grant a big number of unused GIDs.
#
# Then use `podman system migrate` to refresh podman configuration.
#
# Podman will then be able to see those groups, although unfortunately
# the group name in the container will not be "university_data" but it will
# instead look like "adm" or "sys" or "bin".
#
# I'm trying to suggest an improvement to podman to address this a bit better
# at:
# https://github.com/containers/podman/issues/18333
#
ROOT_IN_GROUPS="$(id -G)"
OVERFLOWGID=$(cat "/proc/sys/kernel/overflowgid")
for g in ${ROOT_IN_GROUPS}; do
if [ "$g" -eq 0 ] || [ "$g" -eq "${OVERFLOWGID}" ]; then
# 0 is already our GID
# 65534 is nogroup (the overflow_gid)
continue
fi
usermod -aG "$g" "${USER}"
done
fi

if [[ ${DISABLE_AUTH,,} == "true" ]]; then
cp /etc/rstudio/disable_auth_rserver.conf /etc/rstudio/rserver.conf
@@ -29,18 +121,28 @@ elif [ -z "$PASSWORD" ]; then
printf "\n\n"
fi

if [ "$USERID" -lt 1000 ]; then # Probably a macOS user, https://github.com/rocker-org/rocker/issues/205
if [ "${RUNROOTLESS}" = "true" ]; then
check_user_id=$(grep -F "auth-minimum-user-id" /etc/rstudio/rserver.conf)
if [[ -n $check_user_id ]]; then
echo "minimum authorised user already exists in /etc/rstudio/rserver.conf: $check_user_id"
echo "RUNROOTLESS=true mode requires setting minimum authorised user to 0. Exiting"
exit 1
else
echo "setting minimum authorised user to 0 (RUNROOTLESS=true)"
echo auth-minimum-user-id=0 >>/etc/rstudio/rserver.conf
fi
elif [ "$USERID" -lt 1000 ]; then # Probably a macOS user, https://github.com/rocker-org/rocker/issues/205
echo "$USERID is less than 1000"
check_user_id=$(grep -F "auth-minimum-user-id" /etc/rstudio/rserver.conf)
if [[ -n $check_user_id ]]; then
echo "minumum authorised user already exists in /etc/rstudio/rserver.conf: $check_user_id"
echo "minimum authorised user already exists in /etc/rstudio/rserver.conf: $check_user_id"
else
echo "setting minumum authorised user to 499"
echo "setting minimum authorised user to 499"
echo auth-minimum-user-id=499 >>/etc/rstudio/rserver.conf
fi
fi

if [ "$USER" != "$DEFAULT_USER" ]; then
if [ "${RUNROOTLESS}" != "true" ] && [ "$USER" != "$DEFAULT_USER" ]; then
printf "\n\n"
tput bold
printf "Settings by \e[31m\`-e USER=<new username>\`\e[39m is now deprecated and will be removed in the future.\n"
@@ -49,53 +151,58 @@ if [ "$USER" != "$DEFAULT_USER" ]; then
printf "\n\n"
fi

if [ "$USERID" -ne 1000 ]; then ## Configure user with a different USERID if requested.
if [ "${RUNROOTLESS}" = "true" ]; then
echo "deleting the default user ($DEFAULT_USER) since it is not needed."
userdel "$DEFAULT_USER"
elif [ "$USERID" -ne 1000 ]; then ## Configure user with a different USERID if requested.
echo "deleting the default user"
userdel "$DEFAULT_USER"
echo "creating new $USER with UID $USERID"
useradd -m "$USER" -u $USERID
mkdir -p /home/"$USER"
chown -R "$USER" /home/"$USER"
useradd -m "$USER" -u "$USERID"
mkdir -p "${USERHOME}"
chown -R "$USER" "${USERHOME}"
usermod -a -G staff "$USER"
elif [ "$USER" != "$DEFAULT_USER" ]; then
## cannot move home folder when it's a shared volume, have to copy and change permissions instead
cp -r /home/"$DEFAULT_USER" /home/"$USER"
cp -r /home/"$DEFAULT_USER" "${USERHOME}"
## RENAME the user
usermod -l "$USER" -d /home/"$USER" "$DEFAULT_USER"
groupmod -n "$USER" "$DEFAULT_USER"
usermod -a -G staff "$USER"
chown -R "$USER":"$USER" /home/"$USER"
chown -R "$USER":"$USER" "${USERHOME}"
echo "USER is now $USER"
fi

if [ "$GROUPID" -ne 1000 ]; then ## Configure the primary GID (whether rstudio or $USER) with a different GROUPID if requested.
if [ "${RUNROOTLESS}" != "true" ] && [ "$GROUPID" -ne 1000 ]; then ## Configure the primary GID (whether rstudio or $USER) with a different GROUPID if requested.
echo "Modifying primary group $(id "${USER}" -g -n)"
groupmod -o -g $GROUPID "$(id "${USER}" -g -n)"
groupmod -o -g "$GROUPID" "$(id "${USER}" -g -n)"
echo "Primary group ID is now custom_group $GROUPID"
fi

## Add a password to user
echo "$USER:$PASSWORD" | chpasswd

# Use Env flag to know if user should be added to sudoers
if [[ ${ROOT,,} == "true" ]]; then
if [ "${RUNROOTLESS}" = "true" ]; then
echo "No sudoers changes needed when running rootless"
elif [[ ${ROOT,,} == "true" ]]; then
adduser "$USER" sudo && echo '%sudo ALL=(ALL) NOPASSWD:ALL' >>/etc/sudoers
echo "$USER added to sudoers"
fi

## Change Umask value if desired
if [ "$UMASK" -ne 022 ]; then
echo "server-set-umask=false" >>/etc/rstudio/rserver.conf
echo "Sys.umask(mode=$UMASK)" >>/home/"$USER"/.Rprofile
echo "Sys.umask(mode=$UMASK)" >>"${USERHOME}"/.Rprofile
fi

## Next one for timezone setup
if [ "$TZ" != "Etc/UTC" ]; then
ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ >/etc/timezone
ln -snf /usr/share/zoneinfo/"$TZ" /etc/localtime && echo "$TZ" >/etc/timezone
fi

## Update Locale if needed
if [ "$LANG" != "en_US.UTF-8" ]; then
/usr/sbin/locale-gen --lang $LANG
/usr/sbin/update-locale --reset LANG=$LANG
/usr/sbin/locale-gen --lang "$LANG"
/usr/sbin/update-locale --reset LANG="$LANG"
fi
