Add SVH based symbol to dylibs #73917

Open
bjorn3 opened this issue Jul 1, 2020 · 9 comments
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@bjorn3
Member

bjorn3 commented Jul 1, 2020

Add a symbol derived from the SVH (strict version hash) to every dylib. Any dylib or executable that links against it can then require that symbol to be present, which prevents mixing up different versions of a dylib at runtime. The symbol only needs to exist for dylibs, not cdylibs or rlibs.

See https://internals.rust-lang.org/t/stability-of-dylibs-per-compiler-release/12648/10 for the discussion about this.

@rustbot modify labels: +C-enhancement +T-compiler

@rustbot rustbot added C-enhancement Category: An issue proposing an enhancement or a PR with one. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 1, 2020
@mzabaluev
Contributor

mzabaluev commented Jul 1, 2020

All regular dynamic symbols created by the Rust compiler already embed some form of hash, though as I understood from the discussion linked above, the data that hash is derived from is not sufficient to guard against ABI breakage.

@gilescope
Contributor

Will attempt this as part of #75594.

@gilescope
Contributor

In compiler/rustc_codegen_llvm/src/coverageinfo/mod.rs, fn save_map_to_mod calls llvm::add_global. Is that a good way to add a symbol?

@bjorn3
Member Author

bjorn3 commented Sep 1, 2020

You could do it similarly to how the dylib metadata is written:

let name = exported_symbols::metadata_symbol_name(tcx);
let buf = CString::new(name).unwrap();
let llglobal =
    unsafe { llvm::LLVMAddGlobal(metadata_llmod, common::val_ty(llconst), buf.as_ptr()) };
unsafe {
    llvm::LLVMSetInitializer(llglobal, llconst);
    let section_name = metadata::metadata_section_name(&tcx.sess.target.target);
    let name = SmallCStr::new(section_name);
    llvm::LLVMSetSection(llglobal, name.as_ptr());
    // Also generate a .section directive to force no
    // flags, at least for ELF outputs, so that the
    // metadata doesn't get loaded into memory.
    let directive = format!(".section {}", section_name);
    llvm::LLVMSetModuleInlineAsm2(metadata_llmod, directive.as_ptr().cast(), directive.len())
}

@gilescope
Contributor

Hmm, I can't see any section for that code appearing in an OSX executable. I was expecting a section called __DATA,.rustc to be somewhere in the resulting binary, but I can't find it.

@bjorn3
Member Author

bjorn3 commented Sep 20, 2020

You may need to find some way to set the S_ATTR_NO_DEAD_STRIP section flag for Mach-O objects.

@gilescope
Contributor

Am close. Got it coming out for exes and am adding it in for dylibs. Just need to stop it being stripped for dylibs, and the "reference it from main" trick definitely can't be used with a dylib!
I tried .section __DATA,.rust_svh_hash,no_dead_strip but got:

error: mach-o section specifier uses an unknown section type

So there must be some cunning way to flick that on...
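A note on the error above: as far as I know, the Mach-O form of the directive takes its fields in the order segment, section, type, attributes, so a third field of no_dead_strip is parsed as a section type. A minimal sketch of building the directive with an explicit `regular` type follows. The helper name is hypothetical, and I have not checked that this alone keeps the section alive in a dylib.

```rust
/// Builds a Mach-O `.section` directive that requests `no_dead_strip`.
/// Hypothetical helper for illustration; it is not part of rustc.
fn macho_no_dead_strip_directive(segment: &str, section: &str) -> String {
    // Mach-O segment and section names are limited to 16 bytes each.
    assert!(segment.len() <= 16, "segment name too long for Mach-O");
    assert!(section.len() <= 16, "section name too long for Mach-O");
    // Field order: segment, section, type, attributes.
    // `regular` is the plain section type; `no_dead_strip` is an attribute
    // and therefore has to come after the type.
    format!(".section {},{},regular,no_dead_strip", segment, section)
}
```

Calling it with "__DATA" and ".rust_svh_hash" gives a directive that differs from the attempt above only by the inserted `regular` field.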

@gilescope
Contributor

Ok so this is done for dylibs now in the PR, just not for cdylibs yet.

@eddyb
Member

eddyb commented Jul 31, 2022

Duplicating/moving from my comment at #99944 (comment):

struct LinkGuard {
    deps: &'static [&'static LinkGuard],
}

extern "Rust" {
    // foo[b7924b40b7ed5e7f]::{shim:LINK_GUARD#0}::<0x461e83de35f0b704f7e69b4cc741ad8eu128>
    #[link_name = "_RINSCsfL95rG4I7iB_3foo10LINK_GUARDKo461e83de35f0b704f7e69b4cc741ad8e_E"] 
    static LINK_GUARD_DEP_FOO: LinkGuard;

    // bar[acb4b2d152c0bd2e]::{shim:LINK_GUARD#0}::<0xf8fc0fadc6a6e727eef4b916531abfe9u128>
    #[link_name = "_RINSCsePjaApBJGQA_3bar10LINK_GUARDKof8fc0fadc6a6e727eef4b916531abfe9_E"] 
    static LINK_GUARD_DEP_BAR: LinkGuard;
}

// my_crate[78009e3fbfa2f6af]::{shim:LINK_GUARD#0}::<0xe538955c5950b59a598304a1e701c9fbu128>
#[export_name = "_RINSCsaiLK1vfX74x_8my_crate10LINK_GUARDKoe538955c5950b59a598304a1e701c9fb_E"]
pub static LINK_GUARD: LinkGuard = LinkGuard {
    deps: unsafe { &[
        &LINK_GUARD_DEP_FOO,
        &LINK_GUARD_DEP_BAR,
    ] }
};
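The snippet above needs three separately compiled crates to link. To play with the shape of the data in one file, here is a minimal sketch that keeps the `deps` field and adds two hypothetical fields (`name` and `svh`) purely so a walk has something to report. Making `bar` depend on `foo` is also an assumption, added to show a shared node being visited only once.

```rust
use std::collections::HashSet;

/// Same `deps` field as above; `name` and `svh` are illustrative additions.
struct LinkGuard {
    name: &'static str,
    svh: u128,
    deps: &'static [&'static LinkGuard],
}

static FOO: LinkGuard = LinkGuard {
    name: "foo",
    svh: 0x461e83de35f0b704f7e69b4cc741ad8e,
    deps: &[],
};

// Assumption for the sketch: `bar` depends on `foo`.
static BAR: LinkGuard = LinkGuard {
    name: "bar",
    svh: 0xf8fc0fadc6a6e727eef4b916531abfe9,
    deps: &[&FOO],
};

static MY_CRATE: LinkGuard = LinkGuard {
    name: "my_crate",
    svh: 0xe538955c5950b59a598304a1e701c9fb,
    deps: &[&FOO, &BAR],
};

/// Depth-first walk over the guard graph: dependencies come before their
/// dependents, and every guard is returned exactly once.
fn reachable(root: &'static LinkGuard) -> Vec<&'static LinkGuard> {
    fn visit(
        guard: &'static LinkGuard,
        seen: &mut HashSet<*const LinkGuard>,
        out: &mut Vec<&'static LinkGuard>,
    ) {
        // Statics have stable, distinct addresses, so the address is the identity.
        if !seen.insert(guard as *const LinkGuard) {
            return;
        }
        for &dep in guard.deps {
            visit(dep, seen, out);
        }
        out.push(guard);
    }

    let mut seen = HashSet::new();
    let mut out = Vec::new();
    visit(root, &mut seen, &mut out);
    out
}
```

`reachable(&MY_CRATE)` yields the guards for foo, bar and my_crate in that order, which is the kind of traversal a global constructor could use to touch every guard.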

The idea being that these LinkGuard statics:

  • form a DAG (reifying the dylibs in the static crate dep graph)
  • can be used with e.g. global constructors to keep them alive, worst case
    • that is, if some platform lacks any other way to make #[used] and similar features work (not sure what the status there is)
    • or if we're being paranoid, there could be some dynamic mechanisms involved to "unlock" various things, making abuse more involved than just renaming the symbols or w/e (but it's not like I actually want to design a DRM scheme for Rust dylibs, it's just plausible :P)
    • on a more serious note, if we wanted some kind of "dynamic TypeId validation", this is the kind of place where it could go (i.e. a global set to check conflicts between the static sets of TypeIds that were computed at compile-time for each dylib)
      I heard Swift does some stuff like that (though their issues are more like "runtime monomorphization", where Foo<Bar> needs vtables/"dictionaries"/"witness tables" and RTTI metadata generated for it, but Foo and Bar never met each other during compilation, etc.)
  • the v0 symbols are not just a stylistic choice:
    • the _R prefix, and more generally the v0 format, make them markedly not C-like symbols
      • this reminds me, we should probably start warning/erroring for #[link_name]/#[export_name] attribute values that start with _R[A-Z] (doubly so if they happen to parse as v0 symbols), in a similar vein to how we disallow llvm. symbols (or rather, it's feature-gated)
    • v0 symbols natively include the crate disambiguator (hash), and also easily allow embedding constants (using const-generic syntax - used here for the SVH value)
    • there's a lot of encoding space (already supported by demanglers) that doesn't overlap with v0 symbols generated by normal mangling (mostly namespaces, as there's only a small number of those used by the compiler, and we guarantee pretty much nothing other than that if we give a name to an uppercase namespace, that name is "stable", though not necessarily any implied semantics)
      • I picked the S (shim) namespace since we already use it for compiler-generated functions, but it could've been a lot of other things, like "NL...0", demangling to ...::{L#0} (which demanglers could be taught over time to print in a nicer way) instead of "NS...10LINK_GUARD" demangling to ...::{shim:LINK_GUARD#0}
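
To make the symbol layout concrete, here is a minimal sketch that assembles names of the same shape as the three in the example. The function name is hypothetical. It assumes the crate disambiguator is passed in already base-62 encoded, and it only handles plain ASCII crate names that start with a letter; other identifiers need extra v0 encoding steps that the sketch leaves out.

```rust
/// Assembles a guard symbol of the shape used in the example above.
/// Hypothetical helper for illustration; rustc's mangler is not involved.
fn link_guard_symbol(crate_name: &str, disambiguator: &str, svh: u128) -> String {
    // Keep to the simple case: ASCII identifier starting with a letter.
    assert!(crate_name.starts_with(|c: char| c.is_ascii_alphabetic()));
    assert!(crate_name.chars().all(|c| c.is_ascii_alphanumeric() || c == '_'));

    const GUARD: &str = "LINK_GUARD";
    // _R           v0 prefix
    // I ... E      path with generic arguments
    // N S          nested path in the `S` (shim) namespace
    // C s<dis>_    crate root with its disambiguator
    // <len><name>  length-prefixed identifiers (crate name, then LINK_GUARD)
    // K o <hex>_   const generic argument of type u128, lowercase hex
    format!(
        "_RINSCs{}_{}{}{}{}Ko{:x}_E",
        disambiguator,
        crate_name.len(),
        crate_name,
        GUARD.len(),
        GUARD,
        svh
    )
}
```

With the inputs ("foo", "fL95rG4I7iB", 0x461e83de35f0b704f7e69b4cc741ad8e) this reproduces the link_name used for LINK_GUARD_DEP_FOO above.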
