Binary size could be smaller #107
Comments
Comment by kbknapp Re-energizing this topic. Not necessarily for action prior to 3.0 release, but something to keep in our mind and begin to tackle as available. My comment from #1891
@CreepySkeleton's comment from that same issue:
Finally, @Dylan-DPC's comment from that issue:
|
Comment by kbknapp Just so my thoughts are known, I don't believe we need to remove all dependencies and features for the smallest possible size. What I do think is that we should know which dependencies have a low "weight to functionality" ratio. "Weight" can be assessed in two ways: code size added to clap, and compile time. "Functionality" should be self-evident. I am definitely not in the camp of "make sure there are as few deps as possible." However, I am in the camp of "I don't want to take on a dep that doesn't provide intrinsic value, or is a maintenance burden." The harder crates to evaluate will be ones that don't add a lot to clap functionally, but at the same time don't add much if any compile time or code size either. I would propose that for those kinds of crates we look at either pulling the code internal, or dropping the crate. I'm most concerned with the This would be a great first issue, as it's not a lot of code changing, and doesn't require too much in-depth knowledge of clap. At first, it will primarily be measuring the default crates, or seeing how widespread their use inside clap is. Taking a quick look at our default crates I see:
|
Comment by kbknapp Doing a
Gives us this output via
Discounting clap and std, it looks like Also, if we did something drastic like make auto help messages optional, that alone would cut a huge portion of the code along with size. Additionally and oddly enough, that may be the easiest thing to modularize first. "But wait!" I can hear you say. "Auto help generation is like THE thing clap is used for, right?" Sure. It's one of the things, but I'd argue not the most important at all. In many of the contexts where code size matters the most, help generation is one of the things they don't care about because it's super easy to just write a help string manually for a small CLI. |
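To make the "just write a help string manually" point concrete, here is a minimal sketch of a small CLI that carries its own help text and uses no argument parsing crate at all. The program name, flags, and the `Action` type are hypothetical and only for illustration; nothing here is clap's API.

```rust
use std::env;

// Hand-written help text. The program name and flags are hypothetical.
const HELP: &str = "\
hello 0.1.0

USAGE:
    hello [OPTIONS] [NAME]

OPTIONS:
    -h, --help       Print this help text
    -l, --loud       Shout the greeting
";

/// What the user asked for, decided without any parsing library.
#[derive(Debug, PartialEq)]
enum Action {
    Help,
    Greet { name: String, loud: bool },
}

/// Parse the arguments (program name already removed).
fn parse<I: IntoIterator<Item = String>>(args: I) -> Result<Action, String> {
    let mut name = None;
    let mut loud = false;
    for arg in args {
        if arg == "-h" || arg == "--help" {
            return Ok(Action::Help);
        } else if arg == "-l" || arg == "--loud" {
            loud = true;
        } else if arg.starts_with('-') {
            return Err(format!("unknown option: {}", arg));
        } else {
            // Last positional argument wins.
            name = Some(arg);
        }
    }
    Ok(Action::Greet {
        name: name.unwrap_or_else(|| "world".to_string()),
        loud,
    })
}

fn main() {
    match parse(env::args().skip(1)) {
        Ok(Action::Help) => print!("{}", HELP),
        Ok(Action::Greet { name, loud }) => {
            let msg = format!("Hello, {}!", name);
            println!("{}", if loud { msg.to_uppercase() } else { msg });
        }
        Err(e) => {
            eprintln!("error: {}\n\n{}", e, HELP);
            std::process::exit(2);
        }
    }
}
```

The trade-off is the one described above: the help text and the parser can drift apart, and every edge case is handled by hand, which is only reasonable for a very small CLI.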
Comment by CreepySkeleton Before we move any further, would you please test it in release mode 😄? Nobody cares about debug binaries; release builds are the ones of concern here. |
Comment by kbknapp Literally just edited the comment with that info 😆 |
Comment by pksunkara Did you enable |
Comment by CreepySkeleton main.rs
With lto enabled (release build, no default features)
What's interesting here is that the size of the The overall difference is less than 10Kb (clap and its deps only) and it will thin as more features are actually used. All the further runs will be performed without LTO. See also https://stackoverflow.com/a/52297790. TL;DR: LTO is not about reducing code size. Another interesting read: https://internals.rust-lang.org/t/rust-staticlibs-and-optimizing-for-size/5746. No TL;DR here, go read it, it's worth it. Kevin, just to make sure we're on the same page: I wasn't counting you as a no-deps guy, I was just thinking aloud "how would we sell clap to those guys" and I'm pretty sure the answer is - we wouldn't. If their wariness comes from concern that the deps affect binary size, we can rightfully tell them that all the deps put together (in At the very least, if 5KiB do matter to you, you're working on embedded, and clap will likely never suit you. Take a look at getopts/argh or write it yourself. I agree regarding help messages. It also drags error messages along. What's not clear to me is your logic about "either pulling the code internal, or dropping the crate". It sounds like "Let's copy-paste code from third party crates (we can't just drop them) in order to decrease our maintenance burden". But from that point on, we will be the ones who maintain this copy-pasted code. A contradiction. Could you please elaborate on that? |
Comment by pksunkara
I don't think we should be saying that. We should be using cargo feature flags to make sure all the fields can use clap. |
Comment by TeXitoi To make it work on embedded, you'll need to be alloc-free (usable without String, Vec, and similar collections). |
Comment by Dylan-DPC There's a post 3.0 plan for that already which is why added the |
Comment by CreepySkeleton To clarify my statement about embedded:
To summarize all of that, you either have an extra hundred KiB at your disposal (and this is why I think we shouldn't optimize every single KiB) or clap is not for you. And a bit of personal opinion: as someone with (very limited) experience with embedded, I can say that allocations are almost taboo there. Given that clap relies on |
Comment by TeXitoi Even getopts requires |
Comment by TeXitoi (embedded = bare metal on microcontroller) |
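As an illustration of what "alloc-free" means for an argument parser, the sketch below parses a slice of string slices into a struct that only borrows from its input, so no `String`, `Vec`, or other heap collection is created. The option names and the `Opts` and `ParseError` types are hypothetical, and how the arguments reach the program on a bare-metal target is assumed to be solved elsewhere (on a hosted platform, `std::env::args` itself allocates).

```rust
/// Parsed options that only borrow from the input; nothing is heap-allocated.
#[derive(Debug, PartialEq)]
struct Opts<'a> {
    verbose: bool,
    output: Option<&'a str>,
    input: Option<&'a str>,
}

/// Errors also borrow the offending argument instead of owning a message.
#[derive(Debug, PartialEq)]
enum ParseError<'a> {
    MissingValue(&'a str),
    Unknown(&'a str),
}

/// Parse hypothetical options `-v/--verbose`, `-o/--output <FILE>` and one
/// positional input, using only borrowed data.
fn parse<'a>(args: &[&'a str]) -> Result<Opts<'a>, ParseError<'a>> {
    let mut opts = Opts { verbose: false, output: None, input: None };
    let mut iter = args.iter().copied();
    while let Some(arg) = iter.next() {
        match arg {
            "-v" | "--verbose" => opts.verbose = true,
            "-o" | "--output" => match iter.next() {
                Some(value) => opts.output = Some(value),
                None => return Err(ParseError::MissingValue(arg)),
            },
            _ if arg.starts_with('-') => return Err(ParseError::Unknown(arg)),
            // Last positional argument wins.
            _ => opts.input = Some(arg),
        }
    }
    Ok(opts)
}

fn main() {
    let args = ["-v", "-o", "out.bin", "in.txt"];
    println!("{:?}", parse(&args));
}
```

A fixed struct like this is what makes the approach workable without allocation; anything open-ended, such as an arbitrary number of repeated values, would need a caller-provided buffer or a fixed upper bound.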
Comment by kbknapp A lot to reply to here :) I don't think we need to be concerned with embedded (bare metal) right now. Maybe one day, but there are many things we can do that will improve clap before we worry about embedded. I'm more concerned with embedded-like, i.e. resource-constrained, devices such as small ARM-based ones (Pi, SBC, phone, etc.). I spoke with some Google employees who were looking at clap for that exact use case (presumably Fuchsia), and binary size was their concern. Binary size isn't everyone's concern. To be clear, I'm not trying to supplant We're in a little bit of a catch-22 scenario. Many people assume argument parsing is such a simple task that the deps should be ultra small, with no transitive deps. It's understandable why one would think that. However, as all of us know, it turns out argument parsing is somewhat of a hidden beast. There are edge cases EVERYWHERE, and a near endless amount of features to implement. It turns out argument parsing is complex. Having said that, the issue we're discussing conflates a few concerns that we should probably be more explicit about:
Both binary size and compile times are affected by:
While adoption and perception can be heavily influenced by binary size, compile times, and the dep graph as its own construct. So long as we're doing our due diligence when it comes to crates we take on as dependencies to ensure they're worth the price we pay in terms of binary size and compile times, we're good. Alternatively, if they negatively affect those areas, we should do our best to ensure they're optional. Meanwhile, we should be thinking about the dep graph as its own construct because right, wrong, or indifferent, people do care about how many crates they pull in. Unfortunately, everyone's reason for caring can be different: for some it's binary size, for some it's compile times, for others it's the perception of risk (what happens if a crate has a security flaw, or goes away overnight?). We're addressing the first two concerns already, but the third (perception of risk) is primarily addressed by either using ultra popular, well supported crates, or crates related to/owned by the parent (which is usually signalled by the name, i.e.
You're correct in that part of the answer is, "we don't." But I think we mean different things by this. I mean, "We don't try to sell them on clap, we just do the best we can to care about that scenario." We can do things like making clap modular enough, or structured in a way to turn off features the end user doesn't care about. This has an added benefit of potentially being able to make most of our deps optional. And where they can't be optional, oh well. At least we're trying. clap isn't going to fit every situation and that's fine. But we should at least do what we can to be an option for even those resource constrained situations.
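The "turn off features the end user doesn't care about" idea maps onto Cargo feature flags and conditional compilation. The sketch below shows the general mechanism only: the feature name `help` and the `render_help` function are hypothetical, not clap's actual features or API, and in a real crate the feature would be declared under `[features]` in Cargo.toml.

```rust
// With the (hypothetical) "help" feature enabled, the formatting code is
// compiled in and contributes to binary size.
#[cfg(feature = "help")]
fn render_help(flags: &[(&str, &str)]) -> String {
    let mut out = String::from("OPTIONS:\n");
    for (flag, desc) in flags {
        out.push_str(&format!("    {:<12} {}\n", flag, desc));
    }
    out
}

// With the feature disabled, only this small stub is compiled; the formatting
// code above never reaches the compiler backend or the linker.
#[cfg(not(feature = "help"))]
fn render_help(_flags: &[(&str, &str)]) -> String {
    String::from("help text not compiled in\n")
}

/// Reports at runtime which variant was compiled.
fn help_enabled() -> bool {
    cfg!(feature = "help")
}

fn main() {
    let flags = [("-v", "Verbose output"), ("-o <FILE>", "Output file")];
    println!("help feature enabled: {}", help_enabled());
    print!("{}", render_help(&flags));
}
```

Optional dependencies follow the same pattern: a dependency marked optional in Cargo.toml is only built when a feature that needs it is turned on, which is how a dep can be made "optional" for consumers who care about size or compile time.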
I totally agree with you. However, the problem is many times we don't get the chance to explain ourselves. They see the dep graph front-and-center and many times simply make assumptions from there. That's the part about the "no dep" crowd I'm not a fan of. I can't count the number of reddit comments, or blog posts I've seen that say, "Can you believe [program X] pulled in 170 crates?! That's absurd!!!" with no backing evidence as to why that's absurd. Were those 170 crates not required? How much compile time did they add, etc.? If the compile times were the EXACT same, but [program X] just copy/pasted the code from those crates, you probably wouldn't hear a peep. In fact, you'd probably hear how amazing it was that [program X] has no dependencies! ...I'm not a fan of that. But it's the world we live in.
I should have been more clear 😜 Decreasing our maintenance burden and decreasing our dep graph are separate concerns that get intertwined. Also, taking the code internal is only for the case where we're using a dep for a small, well-defined subset of what the crate offers, and we aren't concerned with the maintenance burden part, just the raw number of deps part. This should only be a last resort for us. This is one of those things that's counterintuitive; copying and pasting a subset of a crate into clap directly, (or even into a new sub crate like Is it dumb? Yep. But again, this should be a last resort for us and not something I'm actually looking at doing seriously. The only crate that comes to mind for this type of thing is |
Comment by pksunkara There are some good arguments regarding inclusion of deps in this reddit thread today, https://www.reddit.com/r/rust/comments/gi7v2v/is_it_wrong_of_me_to_think_that_rust_crates_have/ |
Comment by BurntSushi @kbknapp invited me to participate in this issue, but I have to say, from reading the comments, it seems like there's an undercurrent of counter-productive "us vs them." If I'm coming from (your perspective) the "anti-dependency" camp and you already believe that you can't sell clap to me
then is it really worth my time to participate here if I'm treated as a lost cause in the first place? reddit and low effort comments are a pit, and people love to make sweeping generalizations that make the world look more extreme than it is. In reality, I think most people exercise good judgment and try to straddle this line as best as they can. Sometimes I'm on the "anti-dependency" side of the fence and take steps to reduce my dependency tree. But other times, I'm on the "yeah dependencies are great" side and try to resist the urge to remove dependencies. I think @kbknapp has the right attitude here. The struggle is the important part. Doing due diligence on dependencies before bringing them in, and really weighing their costs and benefits, is what's important. And to me, that's what I see myself doing when I notice that a new dependency is about to come into my tree. What I see is a prior state in which I could opt out, but now I can't, and I don't know why. The code that was deleted looked pretty small, its commit history suggests it was very lightly edited and its performance was more than good enough for me. Moreover, when I first dropped the I'm not familiar with Clap internals, and as someone who has written more than one argument parsing library, I am definitely someone who appreciates their complexity. I never take a good argument parser for granted. They are super hard to build because they need to balance so many competing concerns. IMO, Clap does a great job of this already. So with that said, I'll give some opinions on Clap 3's required dependencies:
|
Comment by kbknapp Thank you for the detailed response! Before I can finish reading the comment and add any thoughts, let me address something real quick as a moderation note:
I want to be very clear with everyone reading these messages (especially including the clap team) that this is not what I want to portray or the tone I want to encourage. Internet communications via text are hard, especially when we start considering native languages and such. (Edit: to be even more clear. I don't think @BurntSushi is portraying this tone. He's reading and noticing this tone from previous comments we made. And I should say thank you for pointing this out! 😄 ) I just want to be clear that the clap team is on the same page of, there is no "us vs them" and all opinions are welcome! We have differences of priorities and how to achieve those priorities, but all opinions are valid within this discussion. Ultimately the clap team will have to make a decision on which priority and implementation strategy to go with, but through discussions and issues like these my goal is to take all opinions in, consider them carefully, and make a decision that is best for clap and all consumers (current and future). ...now back to reading 😄 |
Comment by BurntSushi Also, one other note: IIRC, |
Comment by pksunkara
I strongly agree with this. Unfortunately, I only started using Rust this year (maybe Dec last year), so I missed all the deps-related discussions and am still trying to catch up on understanding the pros and cons related to the Rust ecosystem. That is why I was asking you the recent question on deps in termcolor @BurntSushi. In JavaScript before, while I did try to reduce our usage of deps, I still preferred commonly used packages and using them as deps instead of writing all of them myself, which was the convention all across the ecosystem. If we take the case of |
Comment by kbknapp
My understanding is there are several concerns that can also overlap with each other, and not everyone has the same concerns. As mentioned above, some of those include:
Adding a dep can have negative impacts on point 1, point 2, or both. Some people care about point 1, some people about point 2, and some about both. So if we consider either point a dep's "cost" we have to weigh that cost against its benefits (usually performance, or lower maintenance burden for us directly). This is what I was trying to say earlier: we have to weigh any cost (including the subjective ones) against the objective benefit. The reason for making it optional and not a binary include or not include is because people have various requirements. Some may not care about the binary size, or increased compile time if the feature provided is worth the cost, but since some do care we need a way to allow our consumers the same choices we're making at the library level. For To address the comments per dep from @BurntSushi :
|
Comment by BurntSushi
You probably want to stick with I don't know how hard it would be to avoid doing the equivalent of |
Comment by pksunkara I think using There was a PR in the v2 commit history which removed |
Comment by kbknapp
That's entirely possible. I haven't done any investigating on the matter yet.
That was switching |
Comment by CreepySkeleton
I actually looked at the list a long time ago and found that
Agreed.
Counterpoints:
I don't really have very serious objections to importing the macro into the clap codebase, but I'd rather avoid it nonetheless ("reinventing the wheel" argument). As an argument for importing the macro, I'd say that
This is justified by the need of a specific data structure, and
Yes, we could use a
Well, either std decides to expose the This is core functionality. It breaks, we break. It shines, we shine, too.
90% sure we can simply replace it with |
Comment by bb010g
Would also exposing static help generation via a build script and/or macro be feasible? |
Comment by ssokolow
I'd disagree with that perception. For writing "strongly-typed scripts" that are meant to be tiny, I'd be willing to sacrifice functionality if the gains are proportionate to what's lost, and there do even exist argument parsers designed to be tiny that do ...the deal-breaker is that, last I checked, none of the tiny ones that do That said, I do consider |
Comment by therealprof Just randomly stumbled across this ticket and I do find one argument rather weird in this discussion, which is "performance": How is that even a topic worth making size tradeoffs for? In the end, a command line parser (typically) runs once per application invocation, while bloat is evident even when the application is not used at all. For this kind of crate, size should always have a way higher priority than runtime performance. |
Comment by BurntSushi @therealprof No, performance matters, a lot. This is a problem The main reason why performance matters for CLI parsers is when a lot of arguments are passed to the command, or when the command is repeatedly executed. This can happen pretty easily with things like |
Comment by therealprof
Sure, I wasn't implying that abysmal performance was acceptable. But you don't need superfancy order-preserving hashmaps to get perfectly fine CLI parsing speeds for 99+% of all applications, a regular |
Comment by BurntSushi I see, I didn't realize you were talking at that level of detail. I think at that point, you need a benchmark. I do share your skepticism on that specific point, but would definitely want to subject it to testing. |
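One rough starting point for such a benchmark is sketched below. It compares lookups in a plain `Vec` of pairs (which preserves insertion order for free) against `std::collections::HashMap`. The choice of these two containers, the argument counts, and the key format are assumptions made for illustration; this is not how clap stores its arguments, and wall-clock timing like this is far noisier than a proper benchmark harness.

```rust
use std::collections::HashMap;
use std::time::Instant;

/// Linear scan over (name, value) pairs kept in insertion order.
fn find_vec<'a>(args: &'a [(String, String)], name: &str) -> Option<&'a str> {
    args.iter().find(|(k, _)| k == name).map(|(_, v)| v.as_str())
}

/// Hash lookup; insertion order is not preserved.
fn find_map<'a>(args: &'a HashMap<String, String>, name: &str) -> Option<&'a str> {
    args.get(name).map(|v| v.as_str())
}

/// Build `n` made-up option/value pairs.
fn sample(n: usize) -> Vec<(String, String)> {
    (0..n)
        .map(|i| (format!("--opt{}", i), format!("value{}", i)))
        .collect()
}

fn main() {
    const ROUNDS: usize = 100;
    for &n in &[4usize, 32, 256] {
        let pairs = sample(n);
        let map: HashMap<String, String> = pairs.iter().cloned().collect();
        let keys: Vec<String> = pairs.iter().map(|(k, _)| k.clone()).collect();

        // Time every key being looked up ROUNDS times in the Vec.
        let start = Instant::now();
        let mut vec_hits = 0usize;
        for _ in 0..ROUNDS {
            for k in &keys {
                if find_vec(&pairs, k).is_some() {
                    vec_hits += 1;
                }
            }
        }
        let vec_time = start.elapsed();

        // Same workload against the HashMap.
        let start = Instant::now();
        let mut map_hits = 0usize;
        for _ in 0..ROUNDS {
            for k in &keys {
                if find_map(&map, k).is_some() {
                    map_hits += 1;
                }
            }
        }
        let map_time = start.elapsed();

        // Printing the hit counts keeps the loops from being optimized away.
        println!(
            "n={:>3}: Vec {:?} ({} hits), HashMap {:?} ({} hits)",
            n, vec_time, vec_hits, map_time, map_hits
        );
    }
}
```

Numbers from a sketch like this should only be read in a release build, and they say nothing about the code size each container adds, which is the other half of the tradeoff being debated.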
Issue by smklein
Friday Oct 19, 2018 at 21:59 GMT
Originally opened as clap-rs/clap#1365
Rust Version
rustc 1.31.0-nightly (e7f5d4805 2018-10-18)
Affected Version of clap
2.32.0
Bug or Feature Request Summary
I compared the binary size of a "hello world" application, with and without clap, and came up with the following data:
Using a Cargo.toml:
And a main.rs (of either this, or a version which only calls println!("Hello world")):
I ran the following:
$ cargo rustc --release -- -C link-args=-static && strip target/release/hello
The "Non-clap" version of hello world resulted in the following output from bloaty:
Where the clap version resulted in the following:
(Diff for visibility)
The amount of .text being generated here doubles the size of the binary.
Clap is really useful, and I'd love if I could use it on more constrained environments, where binary size might be expensive.
Chatting with @kbknapp , it seems that there are some changes to clap-rs underway to allow more flexible usage of the crate, potentially disabling some features at compile-time to shrink the binary, but I figured I'd file this bug to track.
Are there currently other ways of toggling clap features, to compare binary sizes for this evaluation?