From 12f2bc6e8c182e0ac0f9b256907f839da8ceb162 Mon Sep 17 00:00:00 2001
From: Remi Gacogne
Date: Tue, 4 Jun 2024 16:24:13 +0200
Subject: [PATCH] dnsdist: Document that eBPF socket filtering requires `CAP_SYS_ADMIN`

We used to be able to use only `CAP_BPF` since kernel 5.8, but the eBPF verifier has been made more strict a few versions later and we now require `CAP_SYS_ADMIN` again.
---
 pdns/dnsdistdist/docs/advanced/ebpf.rst    | 9 ++++-----
 pdns/dnsdistdist/docs/reference/config.rst | 2 +-
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/pdns/dnsdistdist/docs/advanced/ebpf.rst b/pdns/dnsdistdist/docs/advanced/ebpf.rst
index 2905b95e96c9..4efe883a1f86 100644
--- a/pdns/dnsdistdist/docs/advanced/ebpf.rst
+++ b/pdns/dnsdistdist/docs/advanced/ebpf.rst
@@ -1,16 +1,16 @@
 eBPF Socket Filtering
 =====================
 
-:program:`dnsdist` can use `eBPF `_ socket filtering on recent Linux kernels (4.1+) built with eBPF support (``CONFIG_BPF``, ``CONFIG_BPF_SYSCALL``, ideally ``CONFIG_BPF_JIT``). It requires dnsdist to have the ``CAP_SYS_ADMIN`` capabilities at startup, or the more restrictive ``CAP_BPF`` one since Linux 5.8.
+:program:`dnsdist` can use `eBPF `_ socket filtering on recent Linux kernels (4.1+) built with eBPF support (``CONFIG_BPF``, ``CONFIG_BPF_SYSCALL``, ideally ``CONFIG_BPF_JIT``). It requires dnsdist to have the ``CAP_SYS_ADMIN`` capabilities at startup.
 
 .. note::
-  To retain the required capability, ``CAP_SYS_ADMIN`` or ``CAP_BPF`` depending on the Linux kernel version, it is necessary to call :func:`addCapabilitiesToRetain` during startup, as :program:`dnsdist` drops capabilities after startup.
+  To retain the required capability, ``CAP_SYS_ADMIN``, it is necessary to call :func:`addCapabilitiesToRetain` during startup, as :program:`dnsdist` drops capabilities after startup.
 
 .. note::
-  eBPF can be used by unprivileged users lacking the ``CAP_SYS_ADMIN`` (or ``CAP_BPF``) capability on some kernels, depending on the value of the ``kernel.unprivileged_bpf_disabled`` sysctl. Since 5.15 that kernel build setting ``BPF_UNPRIV_DEFAULT_OFF`` is enabled by default, which prevents unprivileged users from using eBPF.
+  eBPF can be used by unprivileged users lacking the ``CAP_SYS_ADMIN`` capability on some kernels, depending on the value of the ``kernel.unprivileged_bpf_disabled`` sysctl. Since 5.15 that kernel build setting ``BPF_UNPRIV_DEFAULT_OFF`` is enabled by default, which prevents unprivileged users from using eBPF.
 
 .. note::
-  ``AppArmor`` users might need to update their policy to allow dnsdist to keep the ``CAP_SYS_ADMIN`` (or ``CAP_BPF``) capability. Adding a ``capability bpf,`` (for ``CAP_BPF``) line to the policy file is usually enough.
+  ``AppArmor`` users might need to update their policy to allow dnsdist to keep the ``CAP_SYS_ADMIN`` capability. Adding a ``capability sys_admin,`` line to the policy file is usually enough.
 
 .. note::
   In addition to keeping the correct capability, large maps might require an increase of ``RLIMIT_MEMLOCK``, as mentioned below.
@@ -129,4 +129,3 @@ The first, legacy format is still used because of the limitations of eBPF socket
 XDP programs are more powerful than eBPF socket filtering ones as they are not limited to accepting or denying a packet, but can immediately craft and send an answer. They are also executed a bit earlier in the kernel networking path so can provide better performance.
 
 A sample program using the maps populated by dnsdist in an external XDP program can be found in the `contrib/ directory of our git repository `__. That program supports answering with a TC=1 response instead of simply dropping the packet.
-
diff --git a/pdns/dnsdistdist/docs/reference/config.rst b/pdns/dnsdistdist/docs/reference/config.rst
index 0beb97fbbca6..7473624276e6 100644
--- a/pdns/dnsdistdist/docs/reference/config.rst
+++ b/pdns/dnsdistdist/docs/reference/config.rst
@@ -42,7 +42,7 @@ Global configuration
   .. versionadded:: 1.7.0
 
   Accept a Linux capability as a string, or a list of these, to retain after startup so that privileged operations can still be performed at runtime.
-  Keeping ``CAP_BPF`` on kernel 5.8+ for example allows loading eBPF programs and altering eBPF maps at runtime even if the ``kernel.unprivileged_bpf_disabled`` sysctl is set.
+  Keeping ``CAP_SYS_ADMIN`` on kernel 5.8+ for example allows loading eBPF programs and altering eBPF maps at runtime even if the ``kernel.unprivileged_bpf_disabled`` sysctl is set.
   Note that this does not grant the capabilities to the process, doing so might be done by running it as root which we don't advise, or by adding capabilities via the systemd unit file, for example.
   Please also be aware that switching to a different user via ``--uid`` will still drop all capabilities.
 
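
Reviewer note: the patch above says that ``CAP_SYS_ADMIN`` must be retained and that ``CAP_BPF`` alone is no longer enough. The following is a minimal Python sketch, not part of dnsdist, for checking which of the two a running process actually holds. It assumes a Linux ``/proc`` filesystem; the helper names ``effective_caps``, ``has_cap`` and ``report`` are hypothetical, and the capability numbers (21 and 39) are taken from the kernel's ``linux/capability.h``.

```python
from pathlib import Path

CAP_SYS_ADMIN = 21  # capability number from linux/capability.h
CAP_BPF = 39        # added in Linux 5.8; per the patch, no longer sufficient


def effective_caps(status_text: str) -> int:
    """Return the effective capability bitmask found in /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            # The value is a hexadecimal bitmask, one bit per capability.
            return int(line.split(":", 1)[1].strip(), 16)
    raise ValueError("no CapEff line found")


def has_cap(mask: int, cap: int) -> bool:
    """Tell whether capability number `cap` is set in `mask`."""
    return bool((mask >> cap) & 1)


def report(pid: str = "self") -> str:
    """Summarise the capabilities of `pid` and the unprivileged eBPF sysctl."""
    mask = effective_caps(Path(f"/proc/{pid}/status").read_text())
    sysctl = Path("/proc/sys/kernel/unprivileged_bpf_disabled")
    unpriv = sysctl.read_text().strip() if sysctl.exists() else "unavailable"
    return (
        f"CAP_SYS_ADMIN={has_cap(mask, CAP_SYS_ADMIN)} "
        f"CAP_BPF={has_cap(mask, CAP_BPF)} "
        f"unprivileged_bpf_disabled={unpriv}"
    )
```

Calling ``report("<pid of dnsdist>")`` after startup should show whether the capability named in :func:`addCapabilitiesToRetain` survived the capability drop. If ``CAP_SYS_ADMIN`` is reported as ``False``, the process was probably never granted it in the first place, which the patch notes is a separate step from retaining it.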