Problem Statement
At the moment /var (also known as the EPHEMERAL partition) doesn’t have any specific structure: users are allowed to create mount points and put files at arbitrary locations under /var.
For a pod hostPath mount to work properly with all the features supported by the kubelet, the mount path should be available in the kubelet mount namespace in the same way as in the host namespace. This requires manual and non-obvious configuration.
For external mounts (e.g. NFS) to work properly when the mounting is done from the kubelet, the mount path should be available in the kubelet mount namespace.
Talos doesn’t offer a way to keep user files in truly ephemeral storage (i.e. tmpfs), where a reboot is enough to clean up the state.
Talos doesn’t support full reconciliation for the machine.files key, as the contents of /var are not known, and the effect of removing a value from machine.files is not clear.
There’s no way to remove parts of /var (e.g. if some directory was created by mistake).
Some critical or system-important parts of /var are not protected from simple mistakes (e.g. creating a wrong hostPath mount under the etcd data directory).
What’s in /var?
lib/etcd - etcd data directory (only controlplane nodes)
system/overlays - Talos internal path for overlayfs mounts (probably almost all can be migrated to tmpfs with small exceptions)
log - pod logs, API server audit logs, etc.
lib/containerd - CRI containerd state (container workspaces)
lib/kubelet - kubelet state
lib/cni - CNI state (???)
run - various ephemeral things (should be in tmpfs)
Proposal
etcd
Make sure the etcd data directory is only accessible by etcd itself (and by Talos for the purposes of backup/restore). No other workload should ever be able to access the etcd data.
E.g. we could use SELinux, which will protect etcd from other workloads while it can also protect workloads from accessing etcd.
kubelet
Mostly the same as etcd: we should look into protecting the data directory from other workloads. As kubelet accesses many different paths, it’s hard to restrict kubelet itself from accessing other directories.
Logs
We can look into making sure other workloads have read-only access to the logs, while kubelet (?) can write the logs.
run
Should we make this tmpfs (if it isn’t already)?
containerd
Not much we can offer, as workloads write to the container scratch space.
overlays
This is a Talos-specific location, and we shouldn’t allow arbitrary writes there (overlayfs upperdir, workdir). We should look into minimizing the overlays on /var (we could replace them with overlays on tmpfs where it makes sense).
/var/mnt
Introduce a new directory (naming TBD) which serves as a root mount point for:
hostPath mounts, local-path-provisioner default path, etc.
mounts done by the kubelet container (e.g. NFS mounts)
This path is mounted as rshared into the kubelet container, so that mounts both ways (from the kubelet to the host, from the host to the kubelet) are visible.
Users are supposed to use only this path for hostPath mounts.
Questions:
how do we enforce it? (e.g. SELinux)
how can we make third-party software use this path? (e.g. OpenEBS, Ceph/Rook, etc.)
this should be documented clearly in the Talos docs
machine.files
We need to split it into the use cases for this feature:
static pods (now better done via .machine.pods); deprecated
creating arbitrary directories (I guess we don’t need it now, given that /var/mnt is mounted to the kubelet?)
dropping configuration for static pods, system extensions - we should rather use a tmpfs location (easier to prune with a reboot)
CRI custom configuration (doesn’t go into /var)
In general, machine.files should be implemented on top of a controller.
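A controller-based approach would make the effect of removing an entry from machine.files well defined: the controller remembers what it wrote and removes whatever is no longer desired. The following is only a minimal sketch of that idea; the `plan` type and `reconcile` function are hypothetical names and not part of Talos, and the file paths are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// plan describes what a hypothetical machine.files controller would do
// on one reconciliation pass. These names are illustrative, not Talos APIs.
type plan struct {
	Write  []string // paths to create or overwrite
	Remove []string // paths written earlier that are no longer desired
}

// reconcile compares the desired files (path -> content) with the files the
// controller wrote on earlier passes and returns the actions to take.
func reconcile(desired, managed map[string]string) plan {
	var p plan
	for path, content := range desired {
		// Write files that are new or whose content changed.
		if old, ok := managed[path]; !ok || old != content {
			p.Write = append(p.Write, path)
		}
	}
	for path := range managed {
		// Remove files that were dropped from the configuration.
		if _, ok := desired[path]; !ok {
			p.Remove = append(p.Remove, path)
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Strings(p.Write)
	sort.Strings(p.Remove)
	return p
}

func main() {
	managed := map[string]string{"/var/mnt/a.conf": "v1", "/var/mnt/b.conf": "v1"}
	desired := map[string]string{"/var/mnt/a.conf": "v2", "/var/mnt/c.conf": "v1"}

	p := reconcile(desired, managed)
	fmt.Println("write:", p.Write)
	fmt.Println("remove:", p.Remove)
}
```

The key design point is that the controller only ever removes paths it previously managed, so it never needs to know the full contents of /var.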
API to Remove File(s)
Should be restricted to work under /var/mnt only.
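One possible shape for that restriction is a path check that runs before anything is deleted. This is a sketch under assumptions: `userRoot`, `resolveUnderRoot` and `removeUserPath` are hypothetical names, the directory name is still TBD in the proposal, and the check is purely lexical (it does not resolve symlinks, which a real implementation would have to handle).

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// userRoot is the proposed user-managed directory (naming TBD in the proposal).
const userRoot = "/var/mnt"

var errOutsideRoot = errors.New("path is not strictly below the user root")

// resolveUnderRoot cleans path and verifies it is strictly below root.
// The root itself and anything outside of it are rejected.
func resolveUnderRoot(root, path string) (string, error) {
	clean := filepath.Clean(path)
	if !filepath.IsAbs(clean) {
		return "", errOutsideRoot
	}
	rel, err := filepath.Rel(root, clean)
	if err != nil || rel == "." || rel == ".." ||
		strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", errOutsideRoot
	}
	return clean, nil
}

// removeUserPath deletes path recursively, but only if it passes the check.
func removeUserPath(root, path string) error {
	target, err := resolveUnderRoot(root, path)
	if err != nil {
		return fmt.Errorf("refusing to remove %q: %w", path, err)
	}
	return os.RemoveAll(target)
}

func main() {
	// Only the check is exercised here, so nothing is deleted.
	for _, p := range []string{"/var/mnt/scratch", "/var/mnt/../lib/etcd", "/var/mnt"} {
		_, err := resolveUnderRoot(userRoot, p)
		fmt.Printf("%-22s allowed=%v\n", p, err == nil)
	}
}
```

Rejecting the root itself keeps the mount point in place, and cleaning the path first means `..` segments cannot be used to reach locations such as the etcd data directory.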
Benefits
Better security model, protecting services from other workloads (e.g. etcd), and also making sure a service doesn’t have access to the data it shouldn’t have access to.
Better structure for the users (/var/mnt)
Controller-based machine.files
Pruning the node can be selective - e.g. if the containerd state gets corrupted on a single controlplane node, it can be pruned without touching etcd data.
Selective pruning of the node, e.g. on upgrade we can prune containerd state and overlay mounts, but keep /var/mnt for hostPath persistence.
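Selective pruning could be expressed as a per-directory policy. The sketch below is illustrative only: the `area` type and the `PruneOnUpgrade` flag are hypothetical, and it covers just the locations the proposal names for the upgrade case (keeping etcd data on upgrade is an assumption made here).

```go
package main

import "fmt"

// area is one location under /var with a hypothetical prune policy.
// Neither the type nor the flag exists in Talos.
type area struct {
	Path           string
	PruneOnUpgrade bool
}

// areas lists only the locations mentioned for the upgrade case.
var areas = []area{
	{Path: "/var/lib/etcd", PruneOnUpgrade: false}, // assumed to be kept
	{Path: "/var/lib/containerd", PruneOnUpgrade: true},
	{Path: "/var/system/overlays", PruneOnUpgrade: true},
	{Path: "/var/mnt", PruneOnUpgrade: false}, // hostPath persistence
}

// upgradePruneList returns the paths that would be wiped on upgrade.
func upgradePruneList(all []area) []string {
	var out []string
	for _, a := range all {
		if a.PruneOnUpgrade {
			out = append(out, a.Path)
		}
	}
	return out
}

func main() {
	fmt.Println(upgradePruneList(areas))
}
```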
Protecting stuff is one thing; another problem right now is that they all share the same filesystem. That means if we use the local path provisioner and consume all the space, etcd will crash due to out-of-disk-space errors. In my old k3s setup I used to have LVM volumes for each of the consumers like etcd, longhorn, local-path and so on.