nixos/security/wrappers: use musl rather than glibc and explicitly unset insecure env vars #259039

Merged
merged 1 commit into from
Oct 6, 2023

Conversation

Member

@edef1c edef1c commented Oct 4, 2023

Description of changes

This mitigates CVE-2023-4911, crucially without a mass-rebuild.
See #258972 for the glibc upgrade that fixes the underlying vulnerability.

Things done

Member

@Ma27 Ma27 left a comment

I like the idea in general, but I'd like to give a few other folks a chance to take a look. I'd merge in a few days though if there's no response from someone else %)

@edef1c edef1c added 1.severity: security 9.needs: port to stable A PR needs a backport to the stable release. labels Oct 4, 2023
Member

@RaitoBezarius RaitoBezarius left a comment

Usage of pkgs.pkgsStatic is usually frowned upon inside nixpkgs (because it reimports nixpkgs I believe), but this provides us with a way to counteract the vulnerability quickly and maybe perfect this idea in the long run.

@RaitoBezarius RaitoBezarius requested review from alyssais, Ericson2314 and a user October 4, 2023 18:15
@edef1c
Member Author

edef1c commented Oct 4, 2023

I think the musl / static linking part of the patch is a wise idea even in the absence of a current CVE, but I'd like to get this merged fairly quickly given that we currently have an unpatched local privesc vulnerability with ~public exploits.

@edef1c
Member Author

edef1c commented Oct 4, 2023

Usage of pkgs.pkgsStatic is usually frowned upon inside nixpkgs (because it reimports nixpkgs I believe)

We already use pkgsStatic to build busybox-sandbox-shell, which is used by Nix itself, so all NixOS system evaluations already involve pkgsStatic.

Member

@andir andir left a comment

I think this is reasonable, even without the current exposure. The size difference likely doesn't matter for machines that are able to run NixOS, and having fewer (runtime) dependencies for these things is always nice.

Given that there are already NixOS musl users I am not even concerned this might break use cases.

What remains a bit of an unknown to me is how musl handles process setup and the like. I'm not sure whether we might be opening up another hole by going this route, but the code looks straightforward and I'd consider it a musl bug if we did.

@alyssais alyssais added 6.topic: static 6.topic: musl Running or building packages with musl libc labels Oct 4, 2023
@alyssais alyssais requested review from yu-re-ka and a team October 4, 2023 18:25
@mweinelt mweinelt requested a review from Mic92 October 4, 2023 18:28
@alois31
Contributor

alois31 commented Oct 4, 2023

Does musl unset LD_PRELOAD and the like with AT_SECURE? Otherwise this will introduce a trivially exploitable privilege escalation in the capability wrapper.

@edef1c
Member Author

edef1c commented Oct 4, 2023

The size difference likely doesn't matter for machines that are able to run NixOS, and having fewer (runtime) dependencies for these things is always nice.

The wrappers are 62K each (vs 16K before), so that comes out to 1.3M extra across all 31(!) wrappers on my system. I don't think that really moves the needle.

What remains a bit of an unknown to me is how musl handles process setup and the like. I'm not sure whether we might be opening up another hole by going this route, but the code looks straightforward and I'd consider it a musl bug if we did.

musl is specifically focused on security and simplicity, and the code is written accordingly. I'm pretty confident it's a better bet than glibc.

@edef1c
Member Author

edef1c commented Oct 4, 2023

Does musl unset LD_PRELOAD and the like with AT_SECURE? Otherwise this will introduce a trivially exploitable privilege escalation in the capability wrapper.

It doesn't look like it. We probably want to cover the full set of variables listed in sysdeps/generic/unsecvars.h, which includes GLIBC_TUNABLES.
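
To make the idea concrete, here is a minimal sketch of explicit scrubbing, assuming the wrapper simply walks a fixed list and calls `unsetenv` before it executes the wrapped program. The names `unsafe_env_vars` and `scrub_unsafe_env` are hypothetical, and the list is only a small illustrative subset of what glibc's `sysdeps/generic/unsecvars.h` contains; this is not the code from the patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>

/* Illustrative subset only (an assumption for this sketch); the
 * authoritative list is glibc's sysdeps/generic/unsecvars.h. */
static const char *const unsafe_env_vars[] = {
    "GLIBC_TUNABLES",
    "GCONV_PATH",
    "LD_AUDIT",
    "LD_LIBRARY_PATH",
    "LD_PRELOAD",
};

/* Remove every listed variable from the current environment.
 * Returns 0 on success, -1 if any unsetenv() call fails. */
static int scrub_unsafe_env(void) {
    size_t n = sizeof unsafe_env_vars / sizeof unsafe_env_vars[0];
    for (size_t i = 0; i < n; i++) {
        if (unsetenv(unsafe_env_vars[i]) != 0)
            return -1;
    }
    return 0;
}
```

A wrapper would call something like this early in `main`, before any `execve`, and abort if it returns -1 so that a partially scrubbed environment is never passed on.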

@risicle
Contributor

risicle commented Oct 4, 2023

I'm not disputing the fix being applied here, but

musl is specifically focused on security and simplicity, and the code is written accordingly. I'm pretty confident it's a better bet than glibc.

I wouldn't be so hasty to come to broad conclusions like this. musl's "simplicity" also means it's missing security features such as an implementation of fortify hardening (and the external fortify-headers project that we use to provide basic support in musl/nixpkgs only has _FORTIFY_SOURCE=1 support). My impression when digging into musl has also been that it has had far fewer eyes on it. My stance would be neutral.

nixos/modules/security/wrappers/wrapper.c Outdated Show resolved Hide resolved
…set insecure env vars

This mitigates CVE-2023-4911, crucially without a mass-rebuild.

We drop insecure environment variables explicitly, including
glibc-specific ones, since musl doesn't do this by default.

Change-Id: I591a817e6d4575243937d9ccab51c23a96bed6f9
@RaitoBezarius
Member

RaitoBezarius commented Oct 5, 2023 via email

@edef1c
Member Author

edef1c commented Oct 5, 2023

  • All indications so far are that the only known exploit for this vulnerability does not impact NixOS or Nix built binaries, since we build roughly everything with an rpath (which Qualys has said they haven't found a way to make their exploit work against, since it relies on overwriting the default rpath value).
    • It's speculation, but not ungrounded, and I've yet to see evidence against it.
    • If there's another way to exploit this, by the time it gets found the patch will likely have made it to channels.

That's a fair point, and makes me feel much more neutral about waiting out the glibc staging-23.05 build without a backport.

This PR has not seen a lot of testing, it does things that are unusual (pkgsStatic usage - I don't think we have much precedent for this in the "base system"?) and that I could imagine to break in some weird configurations like cross builds, tier 3+ platforms, etc.

As mentioned upthread, we ship pkgsStatic.busybox-sandbox-shell as a dependency of pkgs.nix on all Linux systems except RISC-V (where it uses pkgs.busybox), so that particular part has seen plenty of testing.

Contributor

@robryk robryk left a comment

I agree this is beneficial and don't see any issues after a semi-thorough read (I didn't try to go and see for myself what glibc does on startup that musl doesn't though).

Three potential future ideas:

  1. For most wrappers, we don't need to preserve ~any env vars. We could have a configuration option per wrapper that switches to a mode where only explicitly configured variables are preserved.
  2. Maybe we could avoid going through exec()? If we could, we could avoid the thrice-damned capability handling across exec() that prevents one from simulating file capabilities. The reason I'm doubtful this is possible is that I don't think that /proc/self/exe ever gets repointed.
  3. Maybe we could somehow actually simulate file capabilities?

nixos/modules/security/wrappers/wrapper.c Show resolved Hide resolved
@edef1c
Member Author

edef1c commented Oct 5, 2023

MALLOC_* env vars explicitly, since those are legacy ways to set tunables relevant to memory allocation

@edef1c Why "legacy"? On e.g. gnu.org/software/libc/manual/html_node/Malloc-Tunable-Parameters.html it mentions nothing about them being legacy.

gnu.org/software/libc/manual/html_node/Memory-Allocation-Tunables.html describes every single one of these environment variable aliases as "superseded" by the tunables.

Tunable glibc.malloc.check
This tunable supersedes the MALLOC_CHECK_ environment variable and is identical in features.
Tunable glibc.malloc.top_pad
This tunable supersedes the MALLOC_TOP_PAD_ environment variable and is identical in features.
Tunable glibc.malloc.perturb
This tunable supersedes the MALLOC_PERTURB_ environment variable and is identical in features.
Tunable glibc.malloc.mmap_threshold
This tunable supersedes the MALLOC_MMAP_THRESHOLD_ environment variable and is identical in features.
Tunable glibc.malloc.trim_threshold
This tunable supersedes the MALLOC_TRIM_THRESHOLD_ environment variable and is identical in features.
Tunable glibc.malloc.mmap_max
This tunable supersedes the MALLOC_MMAP_MAX_ environment variable and is identical in features.
Tunable glibc.malloc.arena_test
This tunable supersedes the MALLOC_ARENA_TEST environment variable and is identical in features.
Tunable glibc.malloc.arena_max
This tunable supersedes the MALLOC_ARENA_MAX environment variable and is identical in features.
Tunable glibc.cpu.hwcap_mask
This tunable supersedes the LD_HWCAP_MASK environment variable and is identical in features.

It's not technically a hard guarantee, and certainly doesn't use the word "deprecated" anywhere, but I would be surprised if a new environment variable is proposed and accepted.

@edef1c
Member Author

edef1c commented Oct 5, 2023

I agree this is beneficial and don't see any issues after a semi-thorough read (I didn't try to go and see for myself what glibc does on startup that musl doesn't though).

I've only looked at code that pays attention to __libc_enable_secure and/or AT_SECURE, but I didn't find anything significant that I haven't already listed in this thread. More eyeballs are definitely welcome, though.

Three potential future ideas:

  1. For most wrappers, we don't need to preserve ~any env vars. We could have a configuration option per wrapper that switches to a mode where only explicitly configured variables are preserved.

I'm very much in favour of this, and had been considering it myself. I'm happy to drop a patch for that soon.

Figuring out what env vars we need to whitelist for all of the programs we wrap generates a fair amount of long-tail work and will probably cause a bunch of small breakages, but I think it's worth giving a shot.
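
As a rough illustration of the allowlist mode discussed here, the sketch below compacts an `envp`-style array in place so that only explicitly named variables survive. The function names `env_entry_allowed` and `filter_env` are hypothetical, and how the allowlist would be configured per wrapper is left open; nothing here is taken from an actual patch.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if entry ("NAME=value") names a variable listed in allow[].
 * The comparison is on the full name, so "PATHX=1" does not match "PATH". */
static int env_entry_allowed(const char *entry,
                             const char *const *allow, size_t n_allow) {
    const char *eq = strchr(entry, '=');
    size_t name_len = eq ? (size_t)(eq - entry) : strlen(entry);
    for (size_t i = 0; i < n_allow; i++) {
        if (strlen(allow[i]) == name_len &&
            strncmp(entry, allow[i], name_len) == 0)
            return 1;
    }
    return 0;
}

/* Compacts the NULL-terminated envp array in place, keeping only
 * allowlisted entries. Returns the number of entries kept. */
static size_t filter_env(char **envp,
                         const char *const *allow, size_t n_allow) {
    size_t kept = 0;
    for (size_t i = 0; envp[i] != NULL; i++) {
        if (env_entry_allowed(envp[i], allow, n_allow))
            envp[kept++] = envp[i];
    }
    envp[kept] = NULL;
    return kept;
}
```

The filtered array could then be handed to `execve` as its third argument. Compared with a denylist, this fails closed: a newly introduced dangerous variable is dropped by default rather than passed through.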

  2. Maybe we could avoid going through exec()? If we could, we could avoid the thrice-damned capability handling across exec() that prevents one from simulating file capabilities. The reason I'm doubtful this is possible is that I don't think that /proc/self/exe ever gets repointed.

The only mechanism I can think of here is a bit cursed: we could operate as a program interpreter ourselves, such that exec doesn't even come into the picture. It'd be a bit of work to make work across architectures, but it's not insurmountable.

The program interpreter is just a slightly unusual static binary in essence. It does have to do a little bit of ELF parsing, but the binaries we are wrapping are by definition already entrusted with the privileges the wrapper runs with, so that's not too concerning.

  3. Maybe we could somehow actually simulate file capabilities?

Quoting from myself in a sub-thread:

The kernel code that sets AT_SECURE sets it in three cases: UID/GID ≠ effective UID/GID, file capability bits (but not ambient caps, as discussed), and LSMs requesting it.

I don't have much experience with LSMs, but I think it would be very worthwhile to research either figuring out how to configure an existing LSM to cover this, or writing a small LSM of our own.

Having a single entry point for legitimate privilege escalation is a security boon in many ways, but the ambient capability case throws a bit of a wrench in the works. If we cover it with an LSM, we should be golden.

Relying on an LSM for so core a mechanism isn't ideal, since it'll silently fail if people are running unusual kernels (e.g. on embedded devices), but getting a bit more serious about making use of available kernel-side security facilities seems like a valuable general effort.
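
For readers who want to observe the flag under discussion, the following sketch reads `AT_SECURE` from the auxiliary vector with `getauxval`, which both glibc and musl provide in `<sys/auxv.h>`. The helper names `started_in_secure_mode` and `ids_differ` are hypothetical. The second helper only covers the UID/GID case; file capabilities and LSM requests cannot be detected this way, which is why the auxiliary vector entry is the more reliable signal.

```c
#include <sys/auxv.h>
#include <unistd.h>

/* Nonzero if the kernel set AT_SECURE for this process image. */
static int started_in_secure_mode(void) {
    return getauxval(AT_SECURE) != 0;
}

/* Nonzero if real and effective IDs differ, i.e. the first of the
 * three cases in which the kernel sets AT_SECURE. */
static int ids_differ(void) {
    return getuid() != geteuid() || getgid() != getegid();
}
```

A program started through ambient capabilities alone would see `started_in_secure_mode()` return 0, which is the gap the thread is describing.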

@alois31
Contributor

alois31 commented Oct 6, 2023

* All indications so far are that the only known exploit for this vulnerability does not impact NixOS or Nix built binaries, since we build roughly everything with an rpath (which Qualys has said they haven't found a way to make their exploit work against, since it relies on overwriting the default rpath value).
  
  * It's speculation, but not ungrounded, and I've yet to see evidence against it.
  * If there's another way to exploit this, by the time it gets found the patch will likely have made it to channels.

FWIW (I already mentioned this in the security discussion channel), the advisory also contains this paragraph describing an alternative (and likely simpler) way to exploit this:

  • or replace the first GLIBC_TUNABLES with a GLIBC_TUNABLES that
    contains dangerous (SXID_ERASE) tunables, which were previously
    removed by parse_tunables() -- although this seems promising at first,
    exploiting such a replacement would require a SUID-root program that
    setuid(0)s and execve()s another program with a preserved environment
    (to process the dangerous GLIBC_TUNABLES as root, but without
    __libc_enable_secure).

What "SUID-root program that setuid(0)s and execve()s" really means is "program that doesn't drop privileges and executes something without triggering AT_SECURE". And our capability wrapper does exactly that.

Contributor

@alois31 alois31 left a comment

Thinking about it more, I think I like this now. It fixes a real issue, and it isn't obviously broken at least. In case unexpected problems do arise, musl can probably cleanly be swapped for a fixed glibc (and the UNSECURE_ENVVARS_TUNABLES part is probably good to stay in any case).

@delroth delroth merged commit e462c91 into NixOS:master Oct 6, 2023
20 checks passed
@github-actions
Contributor

github-actions bot commented Oct 6, 2023

Backport failed for release-23.05, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally.

git fetch origin release-23.05
git worktree add -d .worktree/backport-259039-to-release-23.05 origin/release-23.05
cd .worktree/backport-259039-to-release-23.05
git checkout -b backport-259039-to-release-23.05
ancref=$(git merge-base 35502f30abc1b59a793926212f5dfcd907cd1fe6 09325d24b685a3ab5507dc906ab4019696d33d59)
git cherry-pick -x $ancref..09325d24b685a3ab5507dc906ab4019696d33d59

// Except for MALLOC_CHECK_ (which is marked SXID_ERASE), these are all
// marked SXID_IGNORE (ignored in secure mode), so even the glibc version
// of this wrapper would leave them intact.
#define UNSECURE_ENVVARS_TUNABLES \
Member

What is our process for reviewing the list of new environment variables in the future?
Since we no longer use glibc, this is now a manual task.

Contributor

I think this was already mentioned somewhere in this (long) PR thread, but: this hardcoded list is not expected to grow, since glibc has switched to controlling these things through GLIBC_TUNABLES instead. And other environment variables are grabbed directly from the glibc source (UNSECURE_ENVVARS).

@nh2
Contributor

nh2 commented Oct 6, 2023

It's nixos-unstable

Ah, OK. I had understood the first sentence in the PR description

This mitigates CVE-2023-4911, crucially without a mass-rebuild.

as meaning that the plan was to roll this out to stable users as well (thus "without a mass-rebuild").

gnu.org/software/libc/manual/html_node/Memory-Allocation-Tunables.html describes every single one of these environment variable aliases as "superseded" by the tunables.

@edef1c Thanks!

@lopsided98
Contributor

It looks like this breaks armv7l-linux:

/nix/store/4lxj7pm6qa6fznja0bwk4c68rd90k6gn-armv7l-unknown-linux-musleabihf-binutils-2.40/bin/armv7l-unknown-linux-musleabihf-ld: /nix/store/cppxlcnd0anv12l2dy79cgpc2qwq3lyq-armv7l-unknown-linux-musleabihf-stage-final-gcc-12.3.0/lib/gcc/armv7l-unknown-linux-musleabihf/12.3.0/crtbeginT.o: relocation R_ARM_MOVW_ABS_NC against `a local symbol' can not be used when making a shared object; recompile with -fPIC
/nix/store/cppxlcnd0anv12l2dy79cgpc2qwq3lyq-armv7l-unknown-linux-musleabihf-stage-final-gcc-12.3.0/lib/gcc/armv7l-unknown-linux-musleabihf/12.3.0/crtbeginT.o:(.fini_array+0x0): dangerous relocation: unsupported relocation
/nix/store/cppxlcnd0anv12l2dy79cgpc2qwq3lyq-armv7l-unknown-linux-musleabihf-stage-final-gcc-12.3.0/lib/gcc/armv7l-unknown-linux-musleabihf/12.3.0/crtbeginT.o:(.init_array+0x0): dangerous relocation: unsupported relocation
collect2: error: ld returned 1 exit status

Interestingly, armv6l-linux is fine.

@lopsided98
Contributor

lopsided98 commented Oct 7, 2023

Related: #115363, 76552e9, https://bugs.launchpad.net/ubuntu/+source/gcc-4.4/+bug/503448

Removing hardeningEnable = [ "pie" ] should fix this. PIE is already enabled by default with musl, except where it is broken.

I was curious as to whether aarch64 is broken at runtime with this PR, as discussed here: #114953 (comment). It turns out that the binaries are being unintentionally dynamically linked, but they appear to work fine.

On aarch64:

/nix/store/00jlpr14yzh1dshjrnss0ks5phcs2yk0-security-wrapper-aarch64-unknown-linux-musl/bin/security-wrapper: ELF 64-bit LSB pie executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /nix/store/zrfsmv0cimq285d22yb5hhdsp7n872b5-musl-static-aarch64-unknown-linux-musl-1.2.3/lib/ld-musl-aarch64.so.1, not stripped

On x86_64:

/nix/store/amqald7kz3jlv82vmi9lbxgp81w7wl9q-security-wrapper-x86_64-unknown-linux-musl/bin/security-wrapper: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, not stripped

@lopsided98
Contributor

On armv6l, we get unintended dynamic linking like on aarch64, but we also get a segfault at runtime.

/nix/store/i1ycacmg5s9x1kw84jw9rf724rpb88sb-security-wrapper-armv6l-unknown-linux-musleabihf/bin/security-wrapper: ELF 32-bit LSB pie executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /nix/store/5vs5w64197gchjiygg5lvhhf9v5r8a13-musl-static-armv6l-unknown-linux-musleabihf-1.2.3/lib/ld-musl-armhf.so.1, not stripped

@edef1c
Member Author

edef1c commented Oct 7, 2023

Related: #115363, 76552e9, bugs.launchpad.net/ubuntu/+source/gcc-4.4/+bug/503448

Removing hardeningEnable = [ "pie" ] should fix this. PIE is already enabled by default with musl, except where it is broken.

Static executables aren't actually relocated, so PIE flags don't end up super relevant. It would be nice to have ASLR for this code, but it's not a trivial undertaking.

I was curious as to whether aarch64 is broken at runtime with this PR, as discussed here: #114953 (comment). It turns out that the binaries are being unintentionally dynamically linked, but they appear to work fine.

On aarch64:

/nix/store/00jlpr14yzh1dshjrnss0ks5phcs2yk0-security-wrapper-aarch64-unknown-linux-musl/bin/security-wrapper: ELF 64-bit LSB pie executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /nix/store/zrfsmv0cimq285d22yb5hhdsp7n872b5-musl-static-aarch64-unknown-linux-musl-1.2.3/lib/ld-musl-aarch64.so.1, not stripped

On x86_64:

/nix/store/amqald7kz3jlv82vmi9lbxgp81w7wl9q-security-wrapper-x86_64-unknown-linux-musl/bin/security-wrapper: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, not stripped

Is this downstream of dynamically-linked musl work maybe? We do actually want to exclude the dynamic linker here.
cc @alyssais

@lopsided98
Contributor

I'm not sure what work you are referring to, but I don't think the unintentional dynamic linking is anything new; the same behavior was described in #114953 (comment) in 2021 as a side effect of the PIE hardening.

@lopsided98
Contributor

Fix in #259509

@edef1c
Member Author

edef1c commented Oct 7, 2023

I'm not sure what work you are referring to, but I don't think the unintentional dynamic linking is anything new; the same behavior was described in #114953 (comment) in 2021 as a side effect of the PIE hardening.

Non-ET_DYN executables are definitionally not relocated, so position-independence of the code doesn't really mean anything. Even on x86-64, pkgsStatic doesn't produce ET_DYN executables. They are all loaded at 0x400000.

We should probably figure out how to make PIE work on static musl, but it's currently meaningless.
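
One way to check what kind of executable a build produced, without relying on `file`, is to read the `e_type` field of the ELF header: a fixed-address executable has type `ET_EXEC`, while a position-independent one has type `ET_DYN`. The sketch below is an assumption-laden illustration for Linux using `<elf.h>`; the function name `elf_type_of` is hypothetical. Note that `ET_DYN` alone does not distinguish a dynamically linked PIE from a static PIE or a shared library, so it complements rather than replaces a check for an interpreter.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Reads e_type from the ELF header of the file at path.
 * Returns the type (e.g. ET_EXEC or ET_DYN), or -1 if the file
 * cannot be read or is not an ELF file. */
static int elf_type_of(const char *path) {
    /* e_type is the 16-bit field directly after e_ident, at the
     * same offset in 32-bit and 64-bit ELF headers. */
    unsigned char buf[EI_NIDENT + 2];
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;
    size_t got = fread(buf, 1, sizeof buf, f);
    fclose(f);
    if (got != sizeof buf || memcmp(buf, ELFMAG, SELFMAG) != 0)
        return -1;
    /* Decode in the byte order the file itself declares. */
    if (buf[EI_DATA] == ELFDATA2MSB)
        return (buf[EI_NIDENT] << 8) | buf[EI_NIDENT + 1];
    return buf[EI_NIDENT] | (buf[EI_NIDENT + 1] << 8);
}
```

Called on one of the wrapper binaries, a result of `ET_EXEC` would correspond to the "LSB executable" lines in the `file` output quoted above, and `ET_DYN` to the "LSB pie executable" lines.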

@risicle
Contributor

risicle commented Oct 7, 2023

In an otherwise-abandoned branch of mine I made bintools-wrapper add -no-dynamic-linker when its detected linkType was not dynamic to prevent similar problems with LLVM-built static binaries (fixed by other means now):

risicle@44555fb

@SuperSandro2000
Member

For anyone curious about the size difference, which IMO doesn't matter:

security-wrapper: ε → ∅, -265.9 KiB
security-wrapper-arp: ε → ∅, -16.6 KiB
security-wrapper-arp-scan-x86_64-unknown-linux: ∅ → ε, +62.9 KiB
security-wrapper-chsh-x86_64-unknown-linux: ∅ → ε, +62.9 KiB
security-wrapper-dbus-daemon-launch: ε → ∅, -16.6 KiB
security-wrapper-dbus-daemon-launch-helper-x86_64-unknown-linux: ∅ → ε, +62.9 KiB
security-wrapper-fusermount-x86_64-unknown-linux: ∅ → ε, +62.9 KiB
security-wrapper-fusermount3: ε → ∅, -16.6 KiB
security-wrapper-fusermount3-x86_64-unknown-linux: ∅ → ε, +62.9 KiB
security-wrapper-iotop-x86_64-unknown-linux: ∅ → ε, +62.9 KiB
security-wrapper-mount-x86_64-unknown-linux: ∅ → ε, +62.9 KiB
security-wrapper-mtr: ε → ∅, -16.6 KiB
security-wrapper-mtr-packet-x86_64-unknown-linux: ∅ → ε, +62.9 KiB
security-wrapper-newgidmap-x86_64-unknown-linux: ∅ → ε, +62.9 KiB
security-wrapper-newgrp-x86_64-unknown-linux: ∅ → ε, +62.9 KiB
security-wrapper-newuidmap-x86_64-unknown-linux: ∅ → ε, +62.9 KiB
security-wrapper-passwd-x86_64-unknown-linux: ∅ → ε, +62.9 KiB
security-wrapper-ping-x86_64-unknown-linux: ∅ → ε, +62.9 KiB
security-wrapper-plocate-x86_64-unknown-linux: ∅ → ε, +62.9 KiB
security-wrapper-sg-x86_64-unknown-linux: ∅ → ε, +62.9 KiB
security-wrapper-su-x86_64-unknown-linux: ∅ → ε, +62.9 KiB
security-wrapper-sudo-x86_64-unknown-linux: ∅ → ε, +62.9 KiB
security-wrapper-sudoedit-x86_64-unknown-linux: ∅ → ε, +62.9 KiB
security-wrapper-tcpdump-x86_64-unknown-linux: ∅ → ε, +62.9 KiB
security-wrapper-umount-x86_64-unknown-linux: ∅ → ε, +62.9 KiB
security-wrapper-unix_chkpwd: ε → ∅, -16.6 KiB
security-wrapper-unix_chkpwd-x86_64-unknown-linux: ∅ → ε, +62.9 KiB

The names of the wrappers are now rather long and ugly because we are cross-compiling, but that's only cosmetic.

@SuperSandro2000
Member

I just noticed on a friend's laptop that he needed to download the final GCC for musl, which is ~350 MB, just to build the wrappers. That is suboptimal and a bit wasteful.

@delroth
Contributor

delroth commented Oct 12, 2023

I just noticed on a friend's laptop that he needed to download the final GCC for musl, which is ~350 MB, just to build the wrappers. That is suboptimal and a bit wasteful.

When we had that discussion a few months back on whether it was fine to require a compiler to build custom wrapper binaries for each wrapper, the assumption was that this shouldn't happen in most cases because wrappers should get cached due to being built for NixOS tests (running on Hydra) already. I'm kind of surprised that this isn't the case. Which wrapper is causing this? It should get tests added.

@SuperSandro2000
Member

I can't reproduce it anymore, but I think those wrappers may not be a blocker for channel advancement, so if you are an early bird they are not cached yet.

@Mic92
Member

Mic92 commented Oct 14, 2023

We could also get rid of musl and use nolibc from the linux kernel :), just like I did for nix-ld. There is also still https://github.com/NixOS/nixpkgs/pull/201536/files which doesn't require the setuid binary to be recompiled at all.
