feat(cpio): improve initramfs image performance and efficiency via cpio reflinks #1531
This issue is being marked as stale because it has not had any recent activity. It will be closed if no further activity occurs. If this is still an issue in the latest release of Dracut and you would like to keep it open please comment on this issue within the next 7 days. Thank you for your contributions. |
@ddiss how is progress going on the pointers from @Firstyear and @tpgxyz? @haraldh ping |
I'm back from leave so will return to this in the coming days. Thanks for the ping. |
Please keep in mind that in Other options would be using a crate that makes stronger guarantees or documenting that it's best-effort. |
Ack, understood. Best-effort is fine, as that's what we already get from the kernel with regard to performing COW reflink or splice fallback. Thanks for the clarification. |
Changes since last version:
|
Changes since last version:
|
/packit build |
Changes since last version:
|
I've tacked on one extra commit which adds test coverage for the The |
Signed-off-by: David Disseldorp <[email protected]>
Individual test scripts may change working directory, so relative paths should be avoided. Signed-off-by: David Disseldorp <[email protected]>
Signed-off-by: David Disseldorp <[email protected]>
Crosvm's rust argument library is very small and simple, while still providing helpful functionality. It will be consumed by dracut-cpio in a subsequent commit. The unmodified, BSD licensed argument.rs source is lifted as-is from https://chromium.googlesource.com/chromiumos/platform/crosvm (release-R92-13982.B b6ae6517aeef9ae1e3a39c55b52f9ac6de8edb31). The one-line crosvm.rs wrapper is needed to ensure that crosvm::argument imports continue to work. Signed-off-by: David Disseldorp <[email protected]>
dracut-cpio is a minimal cpio archive creation utility written in Rust. It provides support for a minimal set of features needed to create performant and space-efficient initramfs archives:
- "newc" archive format only
- reproducible; inode numbers, uid/gid and mtime can be explicitly set
- data segment copy-on-write reflinks
  + using Rust io::copy()'s native copy_file_range() support[1]
  + optional archive data segment alignment for optimal reflink use[2]
- hardlink support
- comprehensive tests asserting GNU cpio binary output compatibility
1. Rust io::copy() copy_file_range() rust-lang/rust#75272
2. Data segment alignment
   We're bending the newc spec a bit to inject zeros after the file path to provide data segment alignment. These zeros are accounted for in the namesize, but some applications may only expect a single zero-terminator (and 4 byte alignment). GNU cpio and Linux initramfs handle this fine as long as PATH_MAX isn't exceeded.
Signed-off-by: David Disseldorp <[email protected]>
If configured with --enable-dracut-cpio, call cargo to build the dracut-cpio release binary. Signed-off-by: David Disseldorp <[email protected]>
The new dracut-cpio binary is capable of performing copy-on-write optimized initramfs archive creation, but due to the rust dependency isn't built / installed by default. This change adds a new "--enhanced-cpio" parameter for dracut which sees dracut-cpio called for archive creation instead of GNU cpio. Signed-off-by: David Disseldorp <[email protected]>
dracut-cpio already carries a bunch of unit tests covering compression and GNU cpio extraction. The purpose of these tests is to exercise the dracut.sh --enhanced-cpio code-paths as well as kernel cpio archive extraction. Signed-off-by: David Disseldorp <[email protected]>
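For anyone trying to picture the data segment alignment described in the dracut-cpio commit message above, here is a rough sketch of the arithmetic. It is not the dracut-cpio source: the names `padded_namesize` and `newc_header_and_name`, the fixed 0100644 mode, and the fallback to an unpadded name when PATH_MAX would be exceeded are all assumptions made for illustration. It also assumes the header offset is already 4-byte aligned and that the alignment is a non-zero multiple of 4.

```rust
/// Size of a "newc" header: 6-byte magic plus 13 eight-digit hex fields.
const NEWC_HDR_LEN: u64 = 6 + 13 * 8;
/// Linux PATH_MAX, used here as the upper bound for namesize (assumption).
const PATH_MAX: u64 = 4096;

/// Hypothetical helper: namesize (path, NUL terminator and extra zero
/// padding) that makes the data segment start on an `align` boundary.
/// Returns None if the padded name would exceed PATH_MAX.
fn padded_namesize(hdr_off: u64, path: &str, align: u64) -> Option<u64> {
    let min = path.len() as u64 + 1; // path plus a single NUL terminator
    let data_off = hdr_off + NEWC_HDR_LEN + min;
    let pad = (align - data_off % align) % align;
    let namesize = min + pad;
    if namesize > PATH_MAX {
        None
    } else {
        Some(namesize)
    }
}

/// Hypothetical helper: build the header and zero-padded name for a
/// regular file. The data segment would follow the returned bytes.
fn newc_header_and_name(hdr_off: u64, path: &str, filesize: u32, align: u64) -> Vec<u8> {
    let min = path.len() as u64 + 1;
    // Assumed fallback: skip alignment when the padding would not fit.
    let namesize = padded_namesize(hdr_off, path, align).unwrap_or(min);
    // Field order: ino, mode, uid, gid, nlink, mtime, filesize, devmajor,
    // devminor, rdevmajor, rdevminor, namesize, check.
    let fields: [u32; 13] = [
        0, 0o100644, 0, 0, 1, 0, filesize, 0, 0, 0, 0, namesize as u32, 0,
    ];
    let mut hdr = String::from("070701");
    for f in fields.iter() {
        hdr.push_str(&format!("{:08X}", f));
    }
    let mut out = hdr.into_bytes();
    out.extend_from_slice(path.as_bytes());
    // The extra zeros are counted in namesize, as the commit message says.
    out.resize((NEWC_HDR_LEN + namesize) as usize, 0);
    // Regular newc rule: header plus name is padded to a multiple of 4.
    while out.len() % 4 != 0 {
        out.push(0);
    }
    out
}
```

With a header at offset 0, the path "bin/foo" and a 4096 byte alignment, this yields a namesize of 3986, so the data segment starts at offset 4096.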
Changes since last version:
|
Ping - is there anything I can do to move this forwards? |
I've updated the cover letter to include more recent benchmark results, which include XFS and Dracut zstd numbers for comparison. |
One other note is that we encountered what appears to be a bug in GRUB's Btrfs driver when reading initramfs images with a large number of shared extents: https://bugzilla.opensuse.org/show_bug.cgi?id=1190982 . A Btrfs developer is investigating the issue. |
@haraldh ping - I'd really appreciate some feedback on whether this can be merged. Just to make it clear, there's no change to default behaviour here, the new functionality is only enabled if built with |
Ping again... sorry to be a pain. |
FWIW, @adam900710 posted a patch which addresses the above GRUB Btrfs bug: |
FTR, I've done quite a few test runs with David's reflinking cpio, and have observed no issues, disk space savings on par with current compression. Here is an example:
The performance is also very good, it's faster than "zstd -3 -T0" in almost all cases. |
I don't see any reason at this point why this can't be merged.
@haraldh could you review this and add your feedback?
Changes
This patchset attempts to speed up initramfs generation for some common (Btrfs / XFS) setups by having Dracut make heavier use of reflinks (AKA copy-on-write clones) during initramfs generation. A good portion of an uncompressed+unstripped initramfs image is duplicate data, which really shouldn't need to be shuffled around when on the same COW clone capable FS.
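As a rough illustration of how data can avoid being shuffled around, the sketch below appends a source file's contents to an open archive file using Rust's io::copy(). On Linux, io::copy() between two File handles attempts copy_file_range(), which leaves it to the kernel to reflink or fall back to a regular copy, so the result is best-effort. The function name `append_data_segment` is hypothetical and this is not taken from the dracut-cpio source.

```rust
use std::fs::File;
use std::io;
use std::path::Path;

/// Hypothetical helper: append the contents of `src` to `archive` at the
/// archive's current file position and return the number of bytes copied.
fn append_data_segment(archive: &mut File, src: &Path) -> io::Result<u64> {
    let mut src_file = File::open(src)?;
    // File-to-File io::copy() lets the standard library try
    // copy_file_range() on Linux. Whether extents end up shared depends
    // on the filesystem and on offset alignment.
    io::copy(&mut src_file, archive)
}
```

Passing plain File handles, rather than a generic writer, is what allows the standard library to pick the copy_file_range() path.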
This is a rework of my #1148 feature submission. Instead of relying on the GNU cpio patchset for copy-on-write FS optimized reflink I/O, a new dracut-cpio tool is provided for cpio archive creation.
My motivating factors for dropping GNU cpio in favour of dracut-cpio are:
- dracut-cpio only needs to provide support for cpio newc archive creation for extraction by the kernel
- [...] copy_file_range() natively
The new dracut-cpio functionality is disabled by default. It can be explicitly enabled by building with the --enable-dracut-cpio configure option and then calling dracut with --enhanced-cpio.
Performance
Preliminary benchmarks indicate a significant improvement in initramfs creation and kernel extraction times with reflinks, as shown via the Dracut runtime and QEMU boot time values respectively.
Storage utilization is also improved with initramfs reflinks, as shown by the significant reduction in exclusive extents. Shared extents represent space reclaimed through deduplication of initramfs source file and cpio data segments.
[Btrfs results table lost in extraction; surviving labels: xz, zstd, fiemap data]
Note: Btrfs fiemap results varied significantly across multiple runs, therefore shared and exclusive values above should only be seen as a guide. In contrast, XFS results were consistent across all runs:
[XFS results table lost in extraction; surviving labels: xz, zstd, fiemap data]
The Dracut xz case corresponds to SUSE Dracut version 055, patched with dracut-cpio reflink support, but configured to run with current SUSE defaults: GNU cpio archiving, xz -0 --check=crc32 --memlimit-compress=50% compression and strip symbol discard.
The Dracut zstd case matches Dracut xz except for compression which is configured to use zstd -3 -T0.
The Dracut reflink case has dracut-cpio alignment and reflinks enabled via enhanced_cpio=yes. To ensure successful extent sharing, initramfs compression and symbol discard are disabled, alongside use of a reflink friendly Dracut staging area (tmpdir=/boot).
These benchmarks were performed on Tumbleweed 20210924 (5.14.6 kernel) virtual machines assigned 2 vCPUs and 8GiB RAM. The QEMU/KVM hypervisor host was running the same OS and stored the raw VM disk images in memory (tmpfs).
Importantly, bootloader initramfs image read times were not evaluated due to a lack of time and existing instrumentation.
Memory backed storage is significantly less vulnerable to the fragmentation effects of CoW reflinks compared to HDDs. Further benchmarking on SSDs and HDDs is necessary, at least before considering dracut-cpio as a complete replacement for GNU cpio archive creation. The benchmarks were done on a completely different machine and kernel to #1148, so shouldn't be used for comparison with those numbers.
Special thanks to @Firstyear for helping me get my Rust changes in shape for upstream submission.
Checklist