RFC: Add an alias attribute to #[link] and -l #1296

Closed

alexcrichton
Member

Add a new alias attribute to #[link] and the -l flag which indicates that
the linkage will happen through another annotation to inform how a library is
linked. This is then leveraged to inform the compiler about dllimport and
dllexport with respect to native libraries on the MSVC platform.

@retep998
Member

I will continue to feel uneasy about any proposal that decides whether to apply dllimport based on kind, because kind=static also causes bundling behavior, which in turn makes the search paths very different from those used when a library is passed to the linker normally.

@arielb1
Contributor

arielb1 commented Sep 25, 2015

The alias behaviour is complicated or just badly explained.

Does #[link(alias="my_foo",name="bar",kind="static")] create a my_foo alias that refers to bar and marks all symbols under the new alias as static (i.e. not dllimport)? I would prefer to put library aliases under crate attributes, or with a -a my_foo=bar,static -lbar compiler flag.

@retep998

As I understand it, the link kind does not affect bundling, only dllimport/dllexport.

@arielb1
Contributor

arielb1 commented Sep 25, 2015

Anyway, why is dllexport not emitted for naked extern { pub fn foo(); }? Is it just nonfunctional on MSVC?

@retep998
Member

@arielb1 kind=static causes the given library to be bundled, while kind=dylib just passes the library name along to the linker.

@retep998
Member

Relevant discussions are on internals and in this issue.

@alexcrichton
Member Author

@arielb1

Can you be more specific in what you think needs expanding or better explaining? For example your question seems addressed via this text in the RFC:

#[link(name = "bar", alias = "foo")]

...

These alias forms mean the dynamic library "bar" is linked but also introduces an alias "foo" for the library.

followed by

The dllimport attribute will be applied to all symbols in an extern block if that block has any linkage directive indicating that the symbols are linked via a dynamic library. (e.g. following alias pointers to their concrete linkage directives).

Which is to say, yes, with your example my_foo is an alias for bar and the static linkage indicates that dllimport will not be applied.
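
The rule quoted above can be modelled in a few lines of Rust. This is only a sketch under stated assumptions, not the compiler's implementation: the `Kind` and `Link` types and the `alias_table` and `needs_dllimport` functions are hypothetical names invented for illustration, and an alias with no concrete definition is simply treated as "not dynamic" here.

```rust
use std::collections::HashMap;

/// How a native library is linked (only the two kinds discussed in this thread).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Kind {
    Dylib,
    Static,
}

/// A simplified `#[link]` directive (hypothetical model, not rustc's types):
/// either a concrete library that may introduce an alias, or a bare
/// reference to an alias that is defined somewhere else.
#[derive(Clone, Debug)]
enum Link {
    Concrete { name: String, alias: Option<String>, kind: Kind },
    AliasRef(String),
}

/// Collect alias -> (concrete library name, kind) from every directive
/// that introduces an alias.
fn alias_table(all: &[Link]) -> HashMap<String, (String, Kind)> {
    let mut table = HashMap::new();
    for link in all {
        if let Link::Concrete { name, alias: Some(a), kind } = link {
            table.insert(a.clone(), (name.clone(), *kind));
        }
    }
    table
}

/// dllimport applies to an extern block if any of its directives, after
/// following alias pointers, says the symbols come from a dynamic library.
/// Assumption: an unresolved alias counts as "not dynamic" in this sketch.
fn needs_dllimport(block: &[Link], table: &HashMap<String, (String, Kind)>) -> bool {
    block.iter().any(|link| match link {
        Link::Concrete { kind, .. } => *kind == Kind::Dylib,
        Link::AliasRef(a) => matches!(table.get(a), Some((_, Kind::Dylib))),
    })
}
```

With the directive from the question (alias "my_foo", name "bar", kind "static"), a block that refers to "my_foo" resolves to a static library, so `needs_dllimport` returns false; changing the kind to dylib makes it return true.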

Can you also elaborate more as to why you would like aliases under crate attributes as well as a new flag on the command line? This proposes modifying the existing flags and attributes for two reasons:

  • Aliases are closely related to native libraries and how they're linked, so it may make sense to closely relate the flags and attributes to consolidate this knowledge all in one location.
  • Aliases are needed on extern blocks to ensure the compiler is able to associate a set of symbols with a native library without actually introducing a new library to link against (as the library may be dynamically determined).

Anyway, why is dllexport not emitted for naked extern { pub fn foo(); }?

Applying dllexport or dllimport is somewhat nonstandard in the sense that you've got to go out of your way to do so, so in my mind it makes sense that if the author did not go out of their way to annotate with #[link] that the compiler shouldn't go out of its way to use dllimport or dllexport.

More technically, however, with a bare extern block the compiler has no current way of knowing where the symbol comes from, so it doesn't know whether it's being included statically or linked dynamically. This can indeed work, even on MSVC, in some situations, but in others it won't. An alternative would be to require all extern blocks to be annotated with some #[link] directive, but that is unfortunately not a backwards-compatible change.

@alexcrichton added the labels T-lang (relevant to the language team, which will review and decide on the RFC) and T-dev-tools (relevant to the development tools team, which will review and decide on the RFC), Sep 28, 2015
@alexcrichton self-assigned this, Sep 28, 2015
@arielb1
Contributor

arielb1 commented Sep 28, 2015

@alexcrichton

#[link] having this kind of side effect just doesn't feel right - and as alias declarations are basically the equivalent of command-line flags, I would prefer to have them at the top-level (like we do with the other command-line-flag attributes).

Actually, -l internal_foo=native_bar,static may be a good enough command-line syntax, but this is totally bikeshed.

@alexcrichton
Member Author

I'd be ok with tweaking the CLI syntax, but one of the main aspects of this is to ensure that each extern block has an associated #[link] attribute, enabling the compiler to connect the dots between symbols and native libraries. If there are annotations at the top of the crate, however, could you elaborate on how the compiler can make these connections?

@arielb1
Contributor

arielb1 commented Sep 29, 2015

#![crate_type="rlib"]
#![cfg_attr(win32, lib_alias(name="foo-static", lib="gnu32-foo", linkage="static", bundle="false"))]
#![cfg_attr(linux, lib_alias(name="foo-static", lib="foo", linkage="static", bundle="false"))]

#[link(name="foo-static")] extern "C" {
    fn foo_create() -> *const Foo;
}

@alexcrichton
Member Author

Ah yeah I considered doing something like that for a bit, but preferred to not use existing forms as it seems ambiguous in that location how "foo-static" is actually linked. As a reader of the code it looks like the library is linked dynamically (as no kind is mentioned), but I'd need more contextual information (e.g. the command line or looking at the top of the crate) to learn how it's actually linked.

By having a new form it explicitly signals "this is linked somewhere else" which serves as a flag to readers that the linkage must happen elsewhere, and it's also an assertion that the linkage will happen elsewhere to succeed (e.g. if you forget for some platform to specify foo-static it doesn't automatically become dynamic).

@arielb1
Contributor

arielb1 commented Sep 30, 2015

@alexcrichton

So use alias instead of name to separate the "native library" and "native library alias" namespaces? That sounds like a good idea. I don't like the side-effecty "alias" part.

@alexcrichton
Member Author

It's certainly true that we've got quite a bit of leeway in terms of attributes we can add here, #[link] is already "namespaced" so we could add all manner of sub-options! Would you be thinking along the lines of something like add_alias or something like that?

I'm not sure if I'd actually expect the attribute form to be used that often to introduce aliases, I expect that they'll almost always come from the command line. If that's the case this could even remove the ability to introduce a separately named alias via the attribute and only allow those to be added on the command line to start out with.

@arielb1
Contributor

arielb1 commented Sep 30, 2015

@alexcrichton

I prefer command-line attributes to have equivalent crate attributes, for those who invoke rustc manually.

@retep998
Member

I will support this RFC as soon as it is demonstrated that it handles the second situation described here.

@alexcrichton
Member Author

🔔 This RFC is now entering its week-long final comment period 🔔

@alexcrichton added the final-comment-period label (will be merged/postponed/closed in ~10 calendar days unless new substantial objections are raised), Nov 6, 2015
@brson
Contributor

brson commented Nov 17, 2015

I want to make sure that bindgen can understand the model and make rustc generate dllimport/dllexport correctly. @alexcrichton says it should be able to do this.

@nikomatsakis
Contributor

OK, I've been banging my head against this RFC and this discussion thread all morning and I'm coming up for air. I'd like to ask for some clarification, because I'm somewhat confused about precisely what scenarios are under discussion. It seems to me that the overall summary here is that @retep998 feels that there are various scenarios in Windows linking which are not covered by Rust's current model. I think that is likely true, but it is really hard for me to be sure, because most of the comments I see are pretty abstract -- I think it'd be helpful to outline these scenarios in terms of actual use cases and libraries. @retep998 can you take the scenarios you summarized in this post here and make a concrete example where each of them arises? Or at least scenario #2, the one you seem most concerned about? Sorry if you've already done this and I didn't find it, in that case maybe you can just point me at it. And please explain it to me like I am an ignorant fool -- I, like many of us, I suspect, do my best to ignore linking at all times except when I am forced to pay attention. :)

Also, in that post you seem to imply that .lib files in windows are always EITHER static OR dynamic symbols, but I assume it can actually be a mix in practice? Furthermore, is that all the options we need to be worried about? Are there other things (like inline directives) that might be relevant here? I'd like to try and have a full picture.

I am concerned that if we adopt a method that is too narrow now, we may find that we have to extend it in ungainly ways later. As @retep998 wrote,

Rust is a native language that integrates into existing native toolchains, so we must respect the large differences in how linking is done across platforms.

It seems to me that basically most of these features and abilities exist for a reason, and we will sooner or later wind up with someone who really wants to use them. But it is ALSO true that the vast majority of people are like me, and just want to work in a cross-platform fashion and pretend the linker doesn't exist, so I hope we can find a design that accommodates both.

@nikomatsakis
Contributor

@retep998

OK, I just wanted to drill into one other thing a bit more. The scenario that you are describing here, and which you later summarized as example #2, is (iiuc) as follows:

  1. you have a .lib file with some static symbols.
  2. you have a Rust library linking against this .lib file and being compiled into a DLL.
  3. you do not want to "bundle" the .lib into this DLL, meaning to statically link the code in there.

This is where I get a bit confused. If the code is not compiled into the DLL, it seems to me it must be dynamically linked. And in that case, it seems like we must tell the EXE that we need them to dynamically link this dependency. So perhaps I am confused? Or perhaps my mental model of how linking works is too simplistic?

You then go on to say:

notice that bundling foo into the .rlib is not the solution, the real solution is simply making sure foo is only passed to a single linker invocation and re-exported for use by the other binaries

I do not understand what this "re-exported" means. It sounds to me like you ARE bundling the code of foo (the static, native library) into the DLL, and then you are expecting other consumers to draw the code from the DLL somehow. But why are they invoking the foo code directly anyway? Is it because the Rust DLL is exporting the native functions directly? Or is it because the EXE file is independently trying to use foo?

I am guessing you have a very specific scenario in mind -- e.g., perhaps your crate that re-exports windows APIs? Can you spell it out a bit more so I understand?

@retep998
Member

@nikomatsakis

The problem is not bundling the .lib into the .dll. No, the problem is bundling the .lib into the .rlib because that completely changes the search paths used for looking for the .lib. When linking a .lib using kind=dylib the .lib is never touched by rustc itself and only when finally creating a binary (either .exe or .dll) it is passed along blindly to the linker and the linker looks for it (this includes various directories full of system libraries like in the Windows SDK). If kind is changed to static, then the .lib is no longer passed to the linker but rather rustc tries to find it and bundle it into the .rlib. Since rustc is not the linker, it doesn't know about all the paths the linker is configured to search in, and thus does not look in those places for the .lib, which is extremely frustrating when someone is trying to link a static library which is provided by the system and so the linker knows where to look for it but rustc does not, thereby causing an error.
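
The difference described above comes down to which tool goes looking for the .lib. The following toy model is a sketch only: `Searcher`, `searcher_for`, and `resolves` are hypothetical names, the two lists of "visible" libraries stand in for real search paths, and nothing here reflects how rustc or a linker actually performs lookup.

```rust
/// Which tool ends up looking for the `.lib`, per the description above.
#[derive(Debug, PartialEq)]
enum Searcher {
    /// kind=static: rustc looks for the library itself to bundle it into the .rlib.
    Rustc,
    /// kind=dylib: the name is passed along untouched and the linker looks for it.
    Linker,
}

/// Map a link kind to the tool that searches for the library.
/// Kinds other than the two discussed here are not modelled.
fn searcher_for(kind: &str) -> Option<Searcher> {
    match kind {
        "static" => Some(Searcher::Rustc),
        "dylib" => Some(Searcher::Linker),
        _ => None,
    }
}

/// Would `lib` be found? `rustc_visible` lists libraries reachable through
/// the paths rustc knows about; `linker_visible` lists those reachable
/// through the linker's own configuration (for example system SDK folders).
fn resolves(
    kind: &str,
    lib: &str,
    rustc_visible: &[&str],
    linker_visible: &[&str],
) -> Option<bool> {
    let visible = match searcher_for(kind)? {
        Searcher::Rustc => rustc_visible,
        Searcher::Linker => linker_visible,
    };
    Some(visible.iter().any(|v| *v == lib))
}
```

In this model a system-provided static library that only the linker can see resolves under kind=dylib but fails under kind=static, which is the frustration being described.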

@retep998
Member

@nikomatsakis
As for the re-exporting thing. Yes, foo should end up being part of the DLL. If the interface to the Rust DLL either pub use's the extern functions from the .lib or uses those functions in public inlineable/monomorphizable functions (this happens quite often), then downstream consumers would need to link to those functions and thus the functions from the .lib need to be dllexport'd in the Rust DLL.

@nikomatsakis
Contributor

@retep998

No, the problem is bundling the .lib into the .rlib because that completely changes the search paths used for looking for the .lib.

Yes, so it seems like bundling the .lib into the .rlib is the wrong thing to do. It seems like the rlib wants to specify "kind=dylib", so that it defers the work to the crate producing a DLL.

If kind is changed to static...

OK so can we drill in on this a bit more. Let's ignore the search path thing for a second and just pretend that rustc knew about the system search paths. Let's ignore Rust too for a second. Can you just tell me what setup you would do in C++ and how it would work?

What I keep hearing seems a bit confusing to me. That is, it seems like you want:

  • a DLL that requires library X, but which does not bundle X
  • not bundling seems to imply that the code for X is dynamically linked
  • from this I believe then that the DLL must include in its header or somewhere that it needs X, so that when the DLL is linked, the linker can also go find X and supply it

This seems like what Rust calls kind=dylib. Presumably then any symbols from X that are needed by the DLL would be marked dllimport, since they are being supplied by this import. But you suggest they need dllexport. From what I understand, dllexport means the code for the symbol is contained in this DLL, and you can find it here. But this would be a bundling, which seems to imply static linking, which we do not want.

So where am I going wrong? (Or is the only reason we don't want static linking because it requires rustc to know about search paths of which it is currently ignorant?)

@nikomatsakis
Contributor

OK, so, now that I understand this RFC a bit better, I have a few questions:

How I understand it is that writing #[link alias=...] basically gives a kind of "symbolic name" or "layer of indirection" by which to refer to a library. Essentially if I write #[link alias="Math"], I am saying: "get these from the library Math, which is defined elsewhere". Then I can either supply a concrete definition for Math (#[link alias="Math" name="gnumath" kind="dylib"]) or I can supply that same definition on the command line.

I find the name "alias" here a bit confusing. I feel like what you are doing is making a kind of "abstract library" -- I wonder if we could specify this a bit differently. I was thinking perhaps we could modify the kind attribute instead (and not add an alias attribute):

  1. There would be a new value for kind, abstract. If I write #[link name="foo" kind="abstract"], it would mean "link to the library foo but i'm not telling you how to find it". It is the same as #[link alias="foo"] in this RFC.
  2. The dylib and static kinds would be extended with an optional path that is the path to supply to the linker. e.g., #[link name="foo" kind="dylib(bar)"]. This would also work on the command line with -l "foo=dylib(bar)". It would mean the same as #[link name="bar" alias="foo" kind="dylib"] in this RFC. For future compat, I'd probably disallow commas in the string bar unless it is enclosed in quotes, e.g., #[link kind="dylib(\"bar,baz\")"]. This gives us room to add additional "arguments" in the future if we wanted.
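
To make the proposed grammar concrete, here is a small parser sketch for it. Everything in it is an assumption for illustration: the syntax was only floated in this comment and is not something rustc accepts, and `LinkKind`, `parse_kind`, `parse_arg`, and `parse_flag` are hypothetical names.

```rust
/// The kinds from the proposal above; dylib and static carry an optional path.
#[derive(Debug, PartialEq)]
enum LinkKind {
    Abstract,
    Dylib(Option<String>),
    Static(Option<String>),
}

/// Parse the argument inside the parentheses. Commas are only allowed
/// when the whole argument is enclosed in double quotes.
fn parse_arg(inner: &str) -> Result<String, String> {
    if let Some(quoted) = inner.strip_prefix('"').and_then(|r| r.strip_suffix('"')) {
        return Ok(quoted.to_string());
    }
    if inner.is_empty() || inner.contains(',') || inner.contains('"') {
        return Err(format!("argument {:?} is empty or needs quotes", inner));
    }
    Ok(inner.to_string())
}

/// Parse a kind value such as `abstract`, `dylib`, or `dylib(bar)`.
fn parse_kind(s: &str) -> Result<LinkKind, String> {
    // Split "dylib(bar)" into the head "dylib" and the optional argument "bar".
    let (head, arg) = match s.find('(') {
        Some(open) => {
            let inner = s[open + 1..]
                .strip_suffix(')')
                .ok_or_else(|| format!("missing ')' in {:?}", s))?;
            (&s[..open], Some(parse_arg(inner)?))
        }
        None => (s, None),
    };
    match (head, arg) {
        ("abstract", None) => Ok(LinkKind::Abstract),
        ("abstract", Some(_)) => Err("abstract takes no path".to_string()),
        ("dylib", arg) => Ok(LinkKind::Dylib(arg)),
        ("static", arg) => Ok(LinkKind::Static(arg)),
        (other, _) => Err(format!("unknown kind {:?}", other)),
    }
}

/// Parse the value of the proposed `-l "foo=dylib(bar)"` form into
/// (library name used in the source, kind with optional real path).
fn parse_flag(value: &str) -> Result<(String, LinkKind), String> {
    let (name, kind) = value
        .split_once('=')
        .ok_or_else(|| format!("expected NAME=KIND in {:?}", value))?;
    Ok((name.to_string(), parse_kind(kind)?))
}
```

Rejecting unquoted commas up front is what leaves room for additional "arguments" later, as the comment suggests.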

@alexcrichton @brson does that make sense? What do you think?

@nikomatsakis
Contributor

@retep998

My conclusion is simply that Rust currently doesn't support non-bundled static libraries and this RFC doesn't change that fact.

I think this is correct. That is, I think this RFC doesn't really change anything fundamental about our linking model. In that sense, it seems harmless to accept -- a strict improvement on today, without much backwards compat hazard to be afraid of -- but clearly doesn't address all possible linking scenarios (though I gather it DOES let us handle a good bit more than we do today, particularly on Windows).

I think what I was personally a bit confused about was the role of kind=dylib. It seems to me that in rlibs, at least, it is almost always wrong to use anything else. kind=dylib lets the ultimate consumer (the one producing a DLL, static lib, or executable) choose how to fulfill the dependency, which is almost certainly what you want. (They can even link statically, if they choose.) But from discussing with @alexcrichton it seems like using kind=dylib does sort of "bias" things towards dynamic linking more than I had originally realized.

@alexcrichton
Member Author

To keep this updated, I intend to tweak this RFC with kind = "abstract", hopefully soon.

@alexcrichton
Member Author

Removing from FCP until I have a chance to update this.

@alexcrichton removed the final-comment-period label (will be merged/postponed/closed in ~10 calendar days unless new substantial objections are raised), Jan 27, 2016
@vadimcn
Contributor

vadimcn commented Jul 28, 2016

@alexcrichton: The thread about MSVC runtime linkage jogged my memory about this RFC. Is attribute syntax the only outstanding issue here?

@alexcrichton
Member Author

@vadimcn yeah I don't think my "hopefully soon" panned out. But yes the intention was to update with @nikomatsakis's suggestion of kind = "abstract" and then also have something along the lines of link_name = "foo" indicating how a library is actually linked.

@vadimcn would you be interested in helping to update this RFC and resubmit it?

@Eh2406
Contributor

Eh2406 commented Jul 28, 2016

Sorry if I have misunderstood, I'm out of my depth but enthusiastic. Does the new kind = cdylib change anything about this discussion?

Also am I correct that this RFC will help with dgrunwald/rust-cpython#10 (comment) ?

@retep998
Member

@Eh2406 This is for specifying the kinds of libraries that you are linking in, not the type of the artifact that will be produced from your crate. But yes, this RFC has the potential to fix that issue.

@Eh2406
Contributor

Eh2406 commented Jul 28, 2016

Thanks for the explanation!

@vadimcn
Contributor

vadimcn commented Jul 29, 2016

Sorry for jumping into this thread late. I didn't have the time to read it all back in December...

So if I create a.dll which depends on b.dll which statically depends on a static native library c.lib, then symbols from c.lib which end up as part of the interface of b.dll should be dllexported from b.dll but if they then end up in the interface of a.dll they should not be dllexported since the c.lib symbols don't exist in a.dll. Instead consumers of a.dll need to link to b.dll's import library and get the relevant c.lib symbols from there.

I disagree with this statement. What prevents a.dll from re-exporting symbols from b.dll?

c.lib symbols don't exist in a.dll.

In the worst case we could generate a stub function like this:

// a.rs
mod b {
    #[link(name="b", kind="dylib")] 
    extern "C" {
        pub fn foo();
    }
}
// now foo exists in a.dll !
pub fn foo() {
    b::foo()
}

However, we can do better than that because Windows directly supports dll import forwarding (sorry, couldn't find this info on MSDN proper). We may need to fix the compiler to actually do that, but I do not see any conceptual difficulties.

@alexcrichton: Yeah, I'll try to get this moving again, hopefully soon 😀

@vadimcn
Contributor

vadimcn commented Jul 29, 2016

@nikomatsakis, @alexcrichton: After thinking some more about this... Seems to me that it's going to be confusing whether we add alias="..." or kind="abstract"+link_name="...". Do we actually need aliases? Can we simply use the library name to refer to it? Something like this:

  • Keep the #[link] attribute as-is.
  • Allow overriding libraries from the command line: if there is #[link(name="foo", kind="...")] in the source, one can override it with, say, -loverride:foo,<kind>=bar.
  • No support for one extern{} block supplying the library name for another one. One can always move the #[link] attribute to the block containing the related symbols. Platform-dependent names can be handled via #[cfg_attr(...)].
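
As a rough illustration of the override idea in the list above, the sketch below applies command-line overrides to the #[link] directives found in the source, matching on the library name. It is a model under stated assumptions, not a proposal for the actual flag parsing: `LinkDirective`, `Override`, and `apply_overrides` are hypothetical names, and the exact flag syntax is left out because the comment only offers it as "say".

```rust
/// A `#[link(name = "...", kind = "...")]` directive as written in the source.
#[derive(Clone, Debug, PartialEq)]
struct LinkDirective {
    name: String,
    kind: String,
}

/// A command-line override: replace the library called `target` in the
/// source with `new_name`, linked as `kind`.
struct Override {
    target: String,
    kind: String,
    new_name: String,
}

/// Apply overrides by library name. Directives without a matching
/// override are kept exactly as written in the source.
fn apply_overrides(source: &[LinkDirective], overrides: &[Override]) -> Vec<LinkDirective> {
    source
        .iter()
        .map(|d| match overrides.iter().find(|o| o.target == d.name) {
            Some(o) => LinkDirective {
                name: o.new_name.clone(),
                kind: o.kind.clone(),
            },
            None => d.clone(),
        })
        .collect()
}
```

Because the library name in the source is the lookup key, no separate alias namespace is needed, which is the simplification being argued for here.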

@retep998
Member

I disagree with this statement. What prevents a.dll from re-exporting symbols from b.dll?

I'm saying the functions/statics provided from c.lib should not be in a.dll, they should only be in b.dll as the single source of truth because they were only statically linked into b.dll. However I see no problems with a.lib, the import library for a.dll, forwarding the imports from b.dll. An import library definitely can specify imports from multiple DLLs at the same time; I just have no idea how to do this, and it would probably need support in LLVM.

@retep998
Member

Allow overriding libraries from the command line: if there is #[link(name="foo", kind="...")] in the source, one can override it with, say, -loverride:foo,<kind>=bar.

As long as the override can be specified via a cargo build script, then this scheme seems fine to me.

@alexcrichton
Member Author

@vadimcn that sounds reasonable to me! I agree that somehow in the source connecting an "abstract name" to a "real name" probably isn't worth it.

@vadimcn
Contributor

vadimcn commented Jul 30, 2016

Do we have any real use cases for overriding native library's "kind" from the command line?
This can only work on #[link]s in the current crate. For upstream .rlibs it can't have any effect, as their code has already been compiled (excluding LTO builds, of course).

@alexcrichton
Member Author

@vadimcn not that I know of, the command line bits here were intended for libraries whose name may not always be known when the source code is written, but at compile time a build script can discover the name (e.g. via pkg-config). The build script could then connect that name (unknown at source-writing time) to the name written down in the source.
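
For context, a Cargo build script can already hand a library name discovered at build time to rustc by printing a `cargo:rustc-link-lib=[KIND=]NAME` line. The helper below is a minimal sketch of that existing mechanism only; it does not implement the alias or override connection discussed in this thread, and `link_lib_directive` and the library name in the usage note are hypothetical.

```rust
/// Format the line a build script prints so that Cargo passes the library
/// to rustc. `kind` is optional, matching the `[KIND=]NAME` form.
fn link_lib_directive(kind: Option<&str>, name: &str) -> String {
    match kind {
        Some(k) => format!("cargo:rustc-link-lib={}={}", k, name),
        None => format!("cargo:rustc-link-lib={}", name),
    }
}
```

A build.rs would then call something like `println!("{}", link_lib_directive(Some("dylib"), "foo-1.2"))` once the real name (here the made-up "foo-1.2") has been discovered.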

Overriding the actual kind of the library though (static vs dynamic) was never intended.

@vadimcn
Contributor

vadimcn commented Jul 31, 2016

@alexcrichton: In that case I don't see a reason to preserve library's identity. All the compiler really needs to know is whether the library is dynamically linked.
For the "unknown library name" case we could simply allow having #[link(kind="...")] extern {} (without name="...").

@alexcrichton
Member Author

@vadimcn yeah that's another possibility, but it doesn't account for the case where linkage can change at compile time. For example Cargo is sometimes statically linked to libcurl (what we ship) but it's dynamically linked to libcurl if you just run cargo build normally.

@vadimcn
Contributor

vadimcn commented Aug 1, 2016

Okay, sounds like we need a new compiler (not linker) flag just to specify library linkage on Windows. Yuck.
Any idea how often this sort of stuff is needed? Can it be handled on a case by case basis with cfg flags?

@alexcrichton
Member Author

Depends kinda on what you're doing. We're not applying dllimport correctly today anyway, so it's really rare you'll actually get yourself into a situation where you need to apply these tags. (at least that's the theory)

It may be handle-able with #[cfg], and Windows libraries don't change names that much, so we could cut back on functionality perhaps.

@nikomatsakis
Contributor

I am fine with not having an "abstract" kind, but it does seem like then sometimes you will be specifying a kind (e.g., dylib) that is kind of a "dummy" value, since you always expect it to be "overridden" by cargo, right?

@alexcrichton
Member Author

@nikomatsakis yeah we may not be entirely able to remove "abstract" as function declarations in an extern block may come from different sources depending on how the code is compiled (sometimes dynamically sometimes statically)

@vadimcn
Contributor

vadimcn commented Aug 14, 2016

Hey, y'all!
Can you please take a look at the first draft of the re-incarnation of this RFC?

(I've decided to dump all the dllexport stuff, because unlike dllimport, it can be applied precisely if we defer it till the linking phase)

@alexcrichton
Member Author

@vadimcn looks great to me!

@alexcrichton
Member Author

Closing in favor of #1717
