Oversized executables #747
Comments
That's one of the reasons the multicall binary exists. As for individual binaries, I'm not sure what we can do except trying to reduce the number of dependencies. |
The project could look into https://github.com/lrs-lang/lib, but it looks like a pretty big change and it may remove large amounts of cross-platform support. |
@Hexel I did :) Unfortunately it's linux only. |
The elephants in the room are static linking, ABI compatibility, and, as you mentioned, libstd. Some demonstrations follow. All "results" blocks are generated using the following command:
All of the results are going to be pretty specific to x86_64. I'm running Debian stable. All of the results below link against libc, the C standard library.
You discuss suitability as a systems language. C is a systems language, but most of its userland applications, like Rust's, use a standard library that takes up hundreds of kilobytes or more. The difference, as demonstrated below, is that cargo by default does not use dynamic linking, because it's expected that most users will not (at this point in time) have an ABI-compatible Rust standard library installed.
Approaches
Rust, using libstd, static linking to libstd
command
results
Rust, using libstd, dynamic linking to libstd
libstd is about 4.4 MB and has to live on the operating system.
command
results
Rust, no libstd framework
commands
results
C++, dynamic linking to libstdc++
Provided on my system by libgcc, which links directly against libm, so together they take up about 1.1 MB total.
commands
results
C
commands
results
Can we just do everything dynamically
Imaginably, if a distribution like Debian or Fedora has a libstdc++ that C++ programs can be compiled against, why not do this for Rust?
Rust currently lacks ABI stability.
Well, it's the same and it isn't. C++ these days uses a (relatively much more) stable ABI: it usually only changes with a major standards change. This means that when libstdc++ is compiled with a slightly newer version of clang, you can install that on your system without also upgrading binaries that use libstdc++ and were compiled with an older version of clang.
When it has ABI stability, we'll be able to compete.
Rust doesn't have that yet. It's in the works, as the Rust developers know it's needed to be able to ship binaries that are not so tightly coupled. But in the meantime, if you have a library that was compiled with rustc v1.2 and you upgrade it and this new version is compiled with rustc v1.5, all binaries and libraries you were using that link against that library also need to be replaced with versions of themselves compiled with rustc v1.5.
At some point in the future, there will be a stable ABI, and some systems will begin installing libstd as a dependency for some other tool. And for systems which have libstd, not only will the footprint of uutils coreutils per se be a few hundred kb less, but we'll be able to painlessly split it into one binary per tool.
In the meantime
In the meantime, the best option is just what we're doing right now: statically include the parts of libstd that we need, then use a "multicall install" which imitates a binary for each tool we're providing via symbolic links. In terms of speed, the difference between dynamic and static linking is negligible. |
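To make the "multicall install" idea concrete, here is a minimal sketch of how a single binary can imitate several tools through symbolic links. It is not the actual uutils dispatcher: the applet table, the function names (applet_name, dispatch, run_true, run_false, run_echo) and the exit code 2 for usage errors are all assumptions made for illustration.

```rust
use std::env;
use std::path::Path;
use std::process::ExitCode;

/// Extract the tool name from argv[0], e.g. "/usr/bin/echo" -> "echo".
/// Hypothetical helper; a real implementation may also strip ".exe" etc.
fn applet_name(argv0: &str) -> Option<&str> {
    Path::new(argv0).file_name()?.to_str()
}

// Toy applets standing in for real tools (assumed for this sketch).
fn run_true(_args: &[String]) -> i32 {
    0
}

fn run_false(_args: &[String]) -> i32 {
    1
}

fn run_echo(args: &[String]) -> i32 {
    println!("{}", args.join(" "));
    0
}

/// Look up an applet by name and run it; None means "no such tool".
fn dispatch(name: &str, args: &[String]) -> Option<i32> {
    match name {
        "true" => Some(run_true(args)),
        "false" => Some(run_false(args)),
        "echo" => Some(run_echo(args)),
        _ => None,
    }
}

fn main() -> ExitCode {
    let argv: Vec<String> = env::args().collect();
    let argv0 = argv.first().map(String::as_str).unwrap_or("");
    let name = applet_name(argv0).unwrap_or("");

    // Case 1: invoked through a symlink, so argv[0] names the tool.
    if let Some(code) = dispatch(name, argv.get(1..).unwrap_or(&[])) {
        return ExitCode::from(code as u8);
    }

    // Case 2: invoked as the multicall binary itself; the first
    // argument names the tool and the rest are passed through.
    match argv.get(1) {
        Some(tool) => match dispatch(tool, argv.get(2..).unwrap_or(&[])) {
            Some(code) => ExitCode::from(code as u8),
            None => {
                eprintln!("unknown tool: {tool}");
                ExitCode::from(2)
            }
        },
        None => {
            eprintln!("usage: <multicall-binary> <tool> [args...]");
            ExitCode::from(2)
        }
    }
}
```

With a layout like this, creating a symlink named echo that points at the compiled binary and running it would take the first branch, while running the binary directly with echo as its first argument would take the second. The shared standard-library code is paid for once rather than once per tool, which is the size argument made above.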
creating linkback for #140 |
@nathanross Dynamic linking is not the answer, nor is multicall. Rust programs already dynamically link to The "features" libstd provides on top of libSystem are minimal—primarily structural—and in trivial programs ought to be removable. And indeed they can be, as LRS-lang demonstrates, but this requires undoing design flaws from Rust. And multicall works rather poorly on windows. The solution instead is to judiciously code the binaries as close to the 80K minimum as possible. |
@alexchandel: how would a C library help with providing e.g. Rust-style string formatting? A viable solution on Windows might be multicall built as a dylib, plus a small stub binary that just calls the main entry point in that library. The latter could use #![no_std] to ensure the smallest possible size. |
@vadimcn It doesn't need to, because string-formatting makes up relatively little of these binaries, and is inlined little with relative ease (once you stop panicking). If you actually read the disassembly for PROFILE=release cp, you'll find that the largest symbols by far (at 34% of the text section) are For comparison, I'm working on an |
fascinating @alexchandel your continued investigation into, and passion about, this topic is greatly appreciated. |
@alexchandel Have you done any additional test since this issue was last discussed? It has been more than 2 years now and rust has changed quite a lot. Would be interesting to see how the binary size was influenced by this. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Some notes from playing with the size: I added:
For multi call I tried out the performance of the size optimized version and it wasn't too bad but there's probably some tweaking to find a balance of size and performance
|
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Still important |
yeah, this is why I removed the wontfix ;) |
Ah sorry, I didn't see that update |
no worries |
So, I have been thinking about that. For example Example in Debian: AFAIK, it works without issue. I don't think there is a significantly better way to improve this. |
Agreed multicall helps a lot with the size but there are also improvements with those cargo settings I listed above
Should any of the settings be adopted? |
Not yet Would you like to try to submit a PR? |
I got here because I wanted to add uutils to Alpine: 4.67 MB for uutils vs 1.02 MB for coreutils. Is this something which should be addressed at the distribution level or here? |
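For scale, the two Alpine figures can be put in the same "percent larger" terms used in the original report. This is only a sketch of the arithmetic; percent_larger is a hypothetical helper, not something from uutils, and the inputs are the package sizes quoted in the comment above.

```rust
/// Percentage by which `ours` exceeds `native`.
/// Hypothetical helper for illustration only.
fn percent_larger(ours: f64, native: f64) -> f64 {
    (ours / native - 1.0) * 100.0
}

fn main() {
    // Alpine package sizes quoted above, in MB.
    let uutils_mb = 4.67;
    let coreutils_mb = 1.02;

    let pct = percent_larger(uutils_mb, coreutils_mb);
    println!(
        "uutils is {:.0}% larger than coreutils ({:.2}x the size)",
        pct,
        uutils_mb / coreutils_mb
    );
}
```

By the same formula, a binary that is "6300% larger" than its native counterpart is 64 times its size.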
@okias that depends on what Alpine is already doing. Are they using all the settings for |
@okias we didn't. With
but researching this I discovered
it's still a bit smaller but perhaps not enough for the needs of @tertsdiepraam . https://git.alpinelinux.org/aports/tree/testing/uutils-coreutils/APKBUILD |
Interesting, thanks! That's not quite enough indeed. This deserves some more investigation. However, I do want to set expectations, there's probably nothing we can do that will immediately cut the size to 1/4 of the current size. Alright, so some questions first (these are both questions you might be able to answer and just open questions I want to investigate):
As a first data point, here's some output of
|
The uutils executables are a bit larger than their native counterparts. These are the stats on OS X with O3, LTO, and alloc_system:
I think the funniest one is nl, which is 6300% larger than the native nl. jemalloc would've added another 230K to each of these.
I realize some of this is Rust's fault: when an optimized, LTO'd, alloc_system'd fn main(){println!("Hi!\n");} is still 84K, there's not much room. For example, from the object dump/disassembly, about 9% of that dead weight was panicking code & string literals for the standard library :\ If we're really condemned to that, and to an 80K hello world, with all the implied overhead (and it's clearly to scale, as seen above), then this raises serious doubts about Rust as a system language.
But surely we can shed some of the remaining 196K/216K/etc off of tr/tsort/friends? The median size of the native executables is 8.0K.