Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] kdump LUKS support #10

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open

[RFC] kdump LUKS support #10

wants to merge 4 commits into from

Conversation

coiby
Copy link
Member

@coiby coiby commented Jun 8, 2024

LUKS is the standard for Linux disk encryption. Many users choose LUKS
and in some use cases like Confidential VM it's mandated. With kdump
enabled, when the 1st kernel crashes, the system could boot into the
kdump/crash kernel and dump the memory image i.e. /proc/vmcore to a
specified target. Currently, when dumping vmcore to a LUKS
encrypted device, there are two problems,

  • Kdump kernel may not be able to decrypt the LUKS partition. For some
    machines, a system administrator may not have a chance to enter the
    password to decrypt the device in kdump initramfs after the 1st kernel
    crashes; For cloud confidential VMs, depending on the policy the
    kdump kernel may not be able to unseal the keys with TPM and the
    console virtual keyboard is untrusted.

  • LUKS2 by default use the memory-hard Argon2 key derivation function
    which is quite memory-consuming compared to the limited memory reserved
    for kdump. Take Fedora example, by default, only 256M is reserved for
    systems having memory between 4G-64G. With LUKS enabled, ~1300M needs
    to be reserved for kdump. Note if the memory reserved for kdump can't
    be used by 1st kernel i.e. an user sees ~1300M memory missing in the
    1st kernel.

Besides users (at least for Fedora) usually expect kdump to work out of
the box i.e. no manual password input is needed. And it doesn't make
sense to derivate the keys again in kdump kernel which seems to be
redundant work.

Based on the new kernel feature that dm-crypt keys can persist for the
kdump kernel [1], this patch which is adapted from [2]

  1. ask the 1st kernel to save a copy of the LUKS volume keys
  2. ask the kdump kernel to add the copy of the LUKS volume keys to
    specified keyring and then use --volume-key-keyring the unlock the
    LUKS device.

[1] https://github.com/coiby/linux/blob/dm_crypt_v15/Documentation/ABI/testing/crash_dm_crypt_keys
[2] https://lists.fedorahosted.org/archives/list/[email protected]/message/Y3KUSJQPN3JHUUC2FPIK7H4HTSX2TUCX/

coiby added 4 commits June 4, 2024 15:36
Signed-off-by: Coiby Xu <[email protected]>
Based on the new kernel feature that dm-crypt keys can persist for the
kdump kernel [1], this patch which is adapted from [2]
1) ask the 1st kernel to save a copy of the LUKS volume keys
2) ask the kdump kernel to add the copy of the LUKS volume keys to
   specified keyring and then use --volume-key-keyring the unlock the
   LUKS device.

[1] https://github.com/coiby/linux/blob/dm_crypt_v15/Documentation/ABI/testing/crash_dm_crypt_keys
[2] https://lists.fedorahosted.org/archives/list/[email protected]/message/Y3KUSJQPN3JHUUC2FPIK7H4HTSX2TUCX/

Signed-off-by: Coiby Xu <[email protected]>
Since commit ffc1ec73b3 ("pid1: add ProtectSystem= as system-wide
configuration, and default it to true in the initrd"), systemd makes
/usr read-only by default and it will cause dracut to not wait for the
LUKS-encrypted devices to be unlocked,

    dracut-cmdline[296]: mv: inter-device move failed: '/tmp/294-daemon-reload.sh' to '/lib/dracut/hooks/initqueue/daemon-reload.sh'; unable to remove target: Read-only file syste

    dracut-cmdline[294]: /sbin/initqueue: line 71: /lib/dracut/hooks/initqueue/work: Read-only file system
    dracut-cmdline[221]: /lib/dracut-dev-lib.sh: line 118: /lib/dracut/hooks/initqueue/finished/devexists-\x2fdev\x2fmyvg\x2fluks_lv.sh: Read-only file system
    dracut-cmdline[221]: /lib/dracut-dev-lib.sh: line 103: /lib/dracut/hooks/emergency/80-\x2fdev\x2fmyvg\x2fluks_lv.sh: Read-only file system

Fix the above issue by making /usr writable.

Signed-off-by: Coiby Xu <[email protected]>
Send reuse command to /sys/kernel/crash_dm_crypt_keys [1] to reuse saved dm
crypt keys.

[1] https://github.com/coiby/linux/blob/dm_crypt_v15/Documentation/ABI/testing/crash_dm_crypt_keys

Signed-off-by: Coiby Xu <[email protected]>
Copy link
Collaborator

@prudo1 prudo1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @coiby,
As far as I can tell the overall code looks fine. I've got a few questions and nits to improve though.
Thanks
Philipp

Comment on lines +1007 to +1014
kdump_check_crypt_targets() {
local _luks_dev _count _devuuid _key_desc
declare -a _luks_devs

_luks_devs=($(get_all_kdump_crypt_dev))
_count=${#_luks_devs[@]}

if [[ $_count -lt 1 ]]; then
Copy link
Collaborator

@prudo1 prudo1 Jul 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the point in having _count. It's only been used once and it's not that much shorter than ${#_luks_devs[@]}. Personaly I would use ${#_luks_devs[@]} directly.

Comment on lines +1026 to +1028
for _luks_dev in "${_luks_devs[@]}"; do
_devuuid=$(maj_min_to_uuid "$_luks_dev")
echo "kdump_luks_uuid=${_devuuid} " >"${initdir}/etc/cmdline.d/62kdump_luks.conf"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/>/>>/ ?
Or is it intended that you only record one uuid even when you have multiple devices? I'm a little bit confused as for the udev rules you enter one per device.

Comment on lines +1030 to +1038
_key_desc=cryptsetup:$_devuuid

mkdir -p "${initdir}/etc/udev/rules.d/"
{
printf -- 'ENV{ID_FS_UUID}=="%s", ' "$_devuuid"
printf -- 'RUN+="/sbin/initqueue --settled --unique --onetime '
printf -- '--name kdump-crypt-target-%%k %s ' "$(command -v cryptsetup)"
printf -- 'luksOpen --volume-key-keyring %s $env{DEVNAME} %s"\n' "%%user:$_key_desc" "luks-$_devuuid"
} >>"${initdir}/etc/udev/rules.d/70-luks-kdump.rules"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the point in having _key_desc. It's only been used once.
Plus I would use a here-doc here instead of a row of printfs. Personally I find that a lot better to read.

Comment on lines +1047 to +1051
mkdir -p "${initdir}/etc/systemd/system.conf.d"
{
echo "[Manager]"
echo "ProtectSystem=false"
} >>"${initdir}/etc/systemd/system.conf.d/kdump_luks.conf"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same like above. Here-doc?

Comment on lines +1036 to +1040
reuse_loaded_luks_keys()
{
local _luks_dev _count _cmds _cmd _state
declare -a _luks_devs

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the variables are unused. And even _state I don't think is useful. IIUC the function could be shortened to

[[ $(< $LUKS_SYSFS ) == loaded ]] && echo -n reuse > $LUKS_SYSFS
return 0

Just for my understanding, what does reuse do? Does it prevent the kernel to discard the stored keys, when unloading the crash kernel?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, wouldn't it be safer to simply drop this function and always setup the a new key ring when loading the kdump kernel. It's more expensive but otherwise you could end up in a situation like

# dump to luks dev1
$ kdumpctl start
# edit kdump.conf to dump to luks dev2
$ kdumpctl rebuild
$ kdumpctl reload

which, IIUC, should fail as the key of dev1 is now used to decrypt dev2, which most likely should fail.

Comment on lines +1049 to +1052
prepare_luks()
{
local _luks_dev _count _cmds _cmd _state
declare -a _luks_devs
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_cmds and _cmd are unused.

Comment on lines +1072 to +1081
if [[ $_state == fresh ]]; then
printf "init %s" "$_count" > $LUKS_SYSFS
fi

for _luks_dev in "${_luks_devs[@]}"; do
_devuuid=$(maj_min_to_uuid "$_luks_dev")
printf "record cryptsetup:%s" "$_devuuid" > $LUKS_SYSFS
done
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no need to use printf here. A simple echo "init $_count" and echo "record cryptsetup:$(maj_min_to_uuid "$_luks_dev")" should work just fine.
Plus _devuuid isn't defined local.

local _luks_dev _count _cmds _cmd _state
declare -a _luks_devs

_state=$(< $LUKS_SYSFS)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when $LUKS_SYSFS doesn't exist? E.g. when you are running on an old kernel or one where the required configs aren't set?

Comment on lines +1043 to +1044
echo >"$initdir/etc/cmdline.d/90crypt.conf"
echo >"$initdir/etc/crypttab"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand it correctly, that you are undoing work other dracut modules have done? If so, could you please add a short comment so we still remember it in a couple of years.

Comment on lines +1056 to +1063
if [[ $_state == recorded ]]; then
return 0
elif [[ $_state == reuse ]]; then
return 0
elif [[ $_state != fresh ]] && [[ $_state != initialized ]]; then
derror "Unknow state about the LUKS keys"
return 1
fi
Copy link
Collaborator

@prudo1 prudo1 Jul 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use a switch case, i.e.

case "$_state" in
  fresh|initialized)
    # handled below
   ;;
  recorded|reuse)
    return 0
    ;;
  *)
    derror "Unknow state about the LUKS keys"
    return 1
    ;;
esac

Personally I find that easier to read.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants