use available_externally linkage for cross-crate inlined functions #16270

Closed
wants to merge 1 commit into from

spernsteiner
Contributor

Cross-crate inlining currently uses internal linkage for functions that have been inlined from other crates. This works fine if the rustc-inlined copy of the function actually gets LLVM-inlined at all of its call sites, but if that doesn't happen, then we end up with an extra copy of the function in the object code. Using LLVM's available_externally linkage allows LLVM to perform inlining, but prevents the extra object code from being generated.
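For readers unfamiliar with the setup, here is a minimal sketch of the kind of code this affects. The names `upstream`, `add_one`, and `call_site` are hypothetical, and a module stands in for what would really be a separate crate. The linkage choice itself is made inside rustc and cannot be expressed in source, so the comments only describe the two outcomes discussed above.

```rust
// Hypothetical stand-in for an upstream crate. In a real build this would be
// a separate crate compiled on its own; a module is used here only so the
// sketch fits in one file.
mod upstream {
    // #[inline] asks rustc to make the body available to downstream crates,
    // so that LLVM can consider inlining it at their call sites.
    #[inline]
    pub fn add_one(x: u32) -> u32 {
        x + 1
    }
}

// Hypothetical downstream caller. If LLVM inlines `add_one` here, the local
// copy of the function disappears. If it does not, then with internal linkage
// the local copy is emitted into this crate's object code as well; with
// available_externally linkage the call would instead resolve to the
// upstream crate's copy.
pub fn call_site(x: u32) -> u32 {
    upstream::add_one(x)
}

fn main() {
    println!("{}", call_site(41)); // prints 42
}
```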

cc #14527

@alexcrichton
Member

Seems cool! Do you know how this affects binary sizes in the distribution?

@spernsteiner
Contributor Author

Whoops, seems this is not such a good idea after all. The code size reduction is only about 0.2% on average, and there's a slight performance penalty.

Turns out LLVM likes to fiddle with calling conventions on internal functions to get better performance. Changing to available_externally linkage prevents it from doing so. LLVM also makes different inlining decisions based on the calling convention being used.

I think we still ought to do something like this in the future, but rustc would probably need to be a little bit more clever about choosing calling conventions. (Also, we would probably get more code size savings by focusing on monomorphizations and drop/visit glue instead of #[inline] functions, but that would be more complicated to implement.)
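As a rough illustration of why monomorphizations can account for more code than #[inline] functions, consider the sketch below. The function `describe` is a hypothetical example, not something from rustc or this patch. Each concrete type it is used with gets its own compiled copy, and a crate that instantiates it may carry those copies itself.

```rust
use std::fmt::Debug;

// A generic function: the compiler generates a separate machine-code copy
// for every concrete `T` it is instantiated with.
fn describe<T: Debug>(value: T) -> String {
    format!("{:?}", value)
}

fn main() {
    // Two instantiations, describe::<u32> and describe::<&str>,
    // so two copies of the function body end up in the object code.
    println!("{}", describe(7u32));
    println!("{}", describe("seven"));
}
```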

@thestinger
Contributor

@epdtry: This is a good idea. At -O0 and -O1, only #[inline(always)] functions are inlined, so the code would perform better if it called out to the better-optimized functions in the libraries. You don't need to worry about the calling convention, because we could just be using fastcc for Rust rather than the C calling convention; our ABI is way less stable than fastcc.
