Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Improve Cargo target-specific dependencies #1361

Merged
merged 3 commits into from
Jan 29, 2016

Conversation

alexcrichton
Copy link
Member

Improve the target-specific dependency experience in Cargo by leveraging the
same #[cfg] syntax that Rust has.

Rendered

Improve the target-specific dependency experience in Cargo by leveraging the
same `#[cfg]` syntax that Rust has.

[Rendered](https://github.com/alexcrichton/rfcs/blob/cargo-cfg-dependencies/text/0000-cargo-cfg-dependencies.md)
@alexcrichton alexcrichton added the T-dev-tools Relevant to the development tools team, which will review and decide on the RFC. label Nov 10, 2015
@alexcrichton alexcrichton self-assigned this Nov 10, 2015
@alexcrichton alexcrichton added the T-compiler Relevant to the compiler team, which will review and decide on the RFC. label Nov 10, 2015
@alexcrichton
Copy link
Member Author

Tagging as T-compiler for the new flag and T-tools for cargo

  • (new rustc flag is --print cfg)

information Cargo will query the compiler via a new command line flag:

```
$ rustc --print cfg
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be done without any rustc changes by having Cargo and compile and run a little shim program e.g. fn main() { println!("{}", cfg!(target_os = "macos")) }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wow that's an excellent point! I hadn't even considered that part. This would have the nice benefits of:

  • Doesn't require changes to the compiler
  • Allows newer Cargo releases to work with older compilers

This does, however, have a drawback that Cargo has to have a whitelist of all possible #[cfg] annotations that can possibly be defined. This means that if the compiler adds new ones over time (e.g. all the simd-related ones) or simply adds new values (e.g. a new OS or a new target_pointer_width, god forbid), it wouldn't automatically be picked up by Cargo.

In terms of forward compatibility I think I'd prefer to stick with --print cfg to be sure that Cargo stays as close to the compiler for as long as possible (although it is likely to inevitably diverge). What do you think though?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does there need to be a whitelist? Can't Cargo just try if it works?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah of course, I shall clarify! So in theory given a set of packages you've got a list of #[cfg] annotations to match against. We can generate a program, figure out answers to all these, but the process is then iterative. Each time a dependency is resolved it may bring in more dependencies, each of which may have #[cfg] annotations. Although Cargo can build a cache of matched cfg directives, this cache will be iteratively built and and is very expensive to create as each iteration involves compiling a program and then running it.

Basically, we don't understand the full set of cfg annotations we need to match against at the start of dependency resolution, so we'd have to invoke the compiler multiple times. In the limit we have to invoke the compiler many times! Note that this also happens on every single invocation of cargo build.

To ensure that we don't regress too much in speed I wanted to opt for a solution where Cargo can quickly learn about the entire set of cfg and then do all the processing in-memory (much faster than spawning a process).

Does that make sense in terms of the constraints here? I can also be sure to add some of this to the RFC as well!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative would be to implement a cfg_str! macro or such that returns the value of such configuration options as target_vendor and others defined in rustc_back::target. This would alleviate the need to check each value seperately, and would also not require the addition of a command-line option to rustc.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The list of needed cfg key-value pairs could be cached between runs so the iterative generation would only be needed on a clean build or when something in the dependency graph changes what values it requests. In those cases you're almost certainly (re)compiling crates anyway, so a bit of extra time in dependency resolution shouldn't matter.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jethrogb

That's actually being proposed in #1258 as well, and although it doesn't suffer the forward compatibility problem quite as much it still does to some degree because Cargo may not know the keys to query for.

@wthrowe

That's a good point! I guess I'd have to time both strategies and see what came out on top, but I think my preference would still be to push on --print cfg if possible, but this sounds to me like at least a reasonable fallback!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah actually @arcnmx pointed out below that this won't work for cross compiles as you can't tell the compiler "compile as if you were this target but actually generate a binary for the host" which unfortunately I think kills the idea of generating a binary with println! + cfg! and then running it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could in theory have cargo try to parse the output of -Z pretty-expanded output, but that seems like a nightmare in comparison to just adding a new rustc flag :)

@briansmith
Copy link

This looks great. I would rather not have it bundled with a bunch of other unrelated changes just because they all happen to not be 100% backward compatible.

@retep998
Copy link
Member

Looks good to me.

@alexcrichton
Copy link
Member Author

@briansmith

Ah yeah I should clarify that the other changes I have in mind are all backwards compatible, just not necessarily forwards compatible. An example of this is that I have a change in mind to add more entries to Cargo.lock. This means that when you build with an older Cargo it will remove these entries, but when building with a newer Cargo it will re-add these entries. Not a huge problem, but something at least to consider!

Regardless though I do personally agree that this should happen independent of other changes. Always good to have a nice pipeline of features on the way!

`--target` (Cargo's current behavior).

> **Note**: There's an [issue open against TOML][toml-issue] to support
> single-quoted keys allowing more ergonomic syntax in some cases like:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no feeling for how TOML would respond to changes like this. Would they implement such a change? If so, when will they implement such a change? The backslash syntax is pretty ugly for us in the mean time. How long would we have to live with it? An alternative is to support the inverted syntax from Rust: [target."cfg(target_os = 'macos')".dependencies] although that might be confusing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good question! I'll cc @BurntSushi, our resident TOML expert.

In terms of implementation, as soon as this is merged into the TOML spec I can update toml-rs to implement the parsing and merge it into Cargo ASAP. If that happened before this RFC and/or implementation landed then it'd all happen at the same time :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any update on this, by the way?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I completely missed this ping. AFAIK, keys are either bare or quoted. If they're quoted, then they follow the same rules as strings themselves:

Keys may be either bare or quoted. Bare keys may only contain letters, numbers, underscores, and dashes (A-Za-z0-9_-). Note that bare keys are allowed to be composed of only digits, e.g. 1234. Quoted keys follow the exact same rules as basic strings and allow you to use a much broader set of key names. Best practice is to use bare keys except when absolutely necessary.

This means that all of the string variants listed here are valid as keys (including multi-line keys, by my read of the spec): https://github.com/toml-lang/toml#user-content-string

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In reading the spec, however, there's a difference between a string literal and a "basic string", e.g. the term "basic string" is used to reference strings that are surrounded by double quotes, and the term for strings surrounded by single quotes is "literal string". This interpretation is also confirmed by the proposed ABNF where the key production only allows for quotation-mark-surrounded strings.

Now this may all just be a misunderstanding, though, as it may be intended that literal (single-quote) strings are allowed in keys as well

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about literal strings? That looks like what we'd need here...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexcrichton Interesting. You might be right about that!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the issue for those following along at home: toml-lang/toml#354

@arcnmx
Copy link

arcnmx commented Nov 11, 2015

Can we also discuss how this itegrates with build scripts?

  • Will cfg items exposed through build scripts like println!("cargo:rustc-cfg=etc") be available or will these conditions be evaluated beforehand?
  • It will be really nice if build scripts can take advantage of this functionality too! I like the --print option, or cargo could provide another environment variable with its results.

The proposed alternative around compiling and running a program won't work well in cross-compiling environments or those where a linker, libstd, or libcore aren't available even if you could execute binaries produced by rustc. And cross-compiling is where this RFC's functionality is very useful!

@alexcrichton
Copy link
Member Author

Right now the resolution and compilation steps in Cargo are very separate, and this is done to ensure that resolution does not require taking time to compile dependencies just to discover it may not have needed to compile those dependencies, for example. As a result I don't belief that any cfg annotations exported by build scripts will be available for use here (they're just too dynamic), but it's a good point and should be documented!

Also note that running a binary should be fine for Cargo to do as it always compiles for the host architecture in that case and then informs the binary about the target (e.g. how build scripts work today).

@arcnmx
Copy link

arcnmx commented Nov 11, 2015

@alexcrichton

Also note that running a binary should be fine for Cargo to do as it always compiles for the host architecture in that case and then informs the binary about the target (e.g. how build scripts work today).

Sure, and then your resulting cfg!() values will be completely wrong if you were to omit --target when building it!

@alexcrichton
Copy link
Member Author

Oh right of course, I didn't connect those dots (thought you were talking about custom build scripts)

@arcnmx
Copy link

arcnmx commented Nov 11, 2015

Sorry, yeah, those were separate. In any case it would be very useful for build scripts to be able to access this information in some form: either by using rustc --print cfg or cargo sending along the info in an environment variable!

@alexcrichton
Copy link
Member Author

Oh I see what you meant now! Yeah I could see Cargo passing along something like TARGET_CFG_<val> down to build scripts for what was discovered from --print cfg, although I'd personally prefer to punt on that for now. Build scripts at least are much more amenable to pulling in a library which parses target strings and perhaps gleans information from them much more readily.

@Stebalien
Copy link
Contributor

Alternatively, allow users to specify custom targets and predefine a few useful targets.

Predefined targets:

[target.x86_64-pc-windows-gnu]
cfg = 'all(target_os = "windows", target_arch = "x86_64", target_env = "gnu")'

# for all triples...

[target.macos]
cfg = 'target_os = "macos"'

# for all OSs...

[target.unix]
cfg = 'unix'

Example usage:

[target.unix.dev-dependencies]
libc = "*"

On compile, cargo would include all targets with matching cfgs.

The primary benefit is that targets get descriptive names which can be used multiple times. That is,

[target.sane]
# no one wants to repeat this...
cfg = 'any(target_os = "macos", target_os = "linux", feature = "sane")'

[target.sane.dev-dependencies]
quickcheck = "*"

[target.sane.dependencies]
some_other_lib = "*"

The secondary benefit is that you don't have the weird [target.cfg(....).something] syntax.

@retep998
Copy link
Member

@Stebalien What would windows be? cfg(windows) or cfg(target_os = "windows")?

@Stebalien
Copy link
Contributor

@retep998 Is there a difference between the two?

@retep998
Copy link
Member

@Stebalien I hope there isn't. It would have to be a really weird obscure platform for it to be both Windows and yet not Windows.

@jethrogb
Copy link
Contributor

cfg(windows) means that Target.options.is_like_windows is true. cfg(target_os = "windows") means that Target.target_os is "windows". One could imagine target_os being something different like "wince" or "winphone".

@Stebalien
Copy link
Contributor

wine versus windows would probably be a better comparison. I guess we'd need os_windows, os_linux, os_macos, unix, and windows.

@alexcrichton
Copy link
Member Author

@Stebalien that's an interesting suggestion! Sounds kinda like #831 and it's a good point that the #[cfg] syntax today can get quite wordy, even if it's less wordy than explicitly listing each triple today

@retep998
Copy link
Member

@daggerbot That example would actually be incorrect for cross compiling scenarios where cfg!(windows) for the host triple could be different than for the target triple.

@Stebalien
Copy link
Contributor

@daggerbot this would be a usability/tooling nightmare.

@sfackler
Copy link
Member

@rust-lang/tools any chance we can start moving on this? Looks like conversation has died down.

@alexcrichton
Copy link
Member Author

@sfackler yes I hope to place this into FCP during our next triage meeting, although we meet every other week instead of every week.

@alexcrichton
Copy link
Member Author

🔔 This RFC is now entering its final comment period 🔔

@alexcrichton alexcrichton added the final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. label Dec 16, 2015
@Stebalien
Copy link
Contributor

I still feel very strongly that my modified version (#1361 (comment)) is much more readable/maintainable.

@retep998
Copy link
Member

I think @Stebalien 's idea looks nicer than what is originally proposed in this RFC.

@alexcrichton
Copy link
Member Author

@Stebalien

I definitely agree that some form of aliasing would be nice, but I'm hesitant to move on it just yet. I think that there's a pretty nice backwards-compatibility story where we can support the syntax in this RFC but then later add the ability to alias later on in the future. Some downsides of pushing harder on having these aliases today are:

  • It's less consistent with Rust itself, which this syntax is targeted at mirroring. Rust may one day gain the ability to alias a bunch of #[cfg] options, and we'd probably want to make sure the Cargo and Rust syntaxes are the same.
  • New platforms don't gain the benefit of target.platform.dependencies immediately, they'll have to wait for new versions of Cargo to be released down the road which includes these aliases. Whereas with scraping rustc --print cfg syntax means that Cargo will automatically support the new platform with normal syntax as soon as a compiler is found that supports it.

I think that in any world we're likely to want the more exhaustive syntax as an option at least, so for now I'd prefer to stick to the more conservative route that mirrors Rust closesly and isn't too complicated.

How does that sound?

@Stebalien
Copy link
Contributor

It's less consistent with Rust itself, which this syntax is targeted at mirroring. Rust may one day gain the ability to alias a bunch of #[cfg] options, and we'd probably want to make sure the Cargo and Rust syntaxes are the same.

Most cross platform libraries I write look like this:

// platform dependent code
mod impl {
    #[cfg(any(target_os = "macos", target_os = "linux", feature = "sane"))]
    mod sane;
    #[cfg(any(target_os = "macos", target_os = "linux", feature = "sane"))]
    use sane::*;
}

Which is exactly what I'm trying to achieve here. I actively avoid putting cfgs inline in the body of my code to avoid copying them around and ifdef hell.

New platforms don't gain the benefit of target.platform.dependencies immediately, they'll have to wait for new versions of Cargo to be released down the road which includes these aliases. Whereas with scraping rustc --print cfg syntax means that Cargo will automatically support the new platform with normal syntax as soon as a compiler is found that supports it.

I doubt the core set of platforms will change much over time. Regardless, we can simply ship a platforms.toml with rust.

I think that in any world we're likely to want the more exhaustive syntax as an option at least.

I disagree. Putting cfgs in toml keys looks really hacky to me and is no more flexible.

@alexcrichton
Copy link
Member Author

I would personally like to keep the scope of this RFC to be relatively small, so if we start diving into the realm of shipping TOML configuration with the compiler it's straying pretty far from the intentions. I also think it's important to consider what the typical usage of this will look like.

For example, I suspect most target specific dependencies fall under cfg(windows) or cfg(unix), in which case the difference between the two I think is subjective

[target."cfg(unix)".dependencies]

# ...

[target.unix.dependencies]

If you start going outside that realm, I suspect that most dependencies are sort of one-off configured. I would consider there to be a drastic difference between

[target."cfg(target_arch = \"x86_64\")".dependencies]

# ...

[target.x86_64]
cfg = 'cfg(target_arch = "x86_64")'

[target.x86_64.dependencies]

If you note that we shouldn't support the explicit syntax, then I think that if we have to provide a built-in configuration for all possible "common permutations" to make the usage ergonomic is a bit of a non-starter. There is also the question of when Cargo sees target.<string>.dependencies, how does it know whether string is an alias for more cfgs or whether it's an actual custom target spec? Today it just treats it entirely opaquely.

@Stebalien
Copy link
Contributor

For example, I suspect most target specific dependencies fall under cfg(windows) or cfg(unix), in which case the difference between the two I think is subjective

I completely agree. I just subjectively find [target."cfg(unix)".dependencies] to be ugly (specifically, the internal quotes). 😉

If you start going outside that realm, I suspect that most dependencies are sort of one-off configured ... If you note that we shouldn't support the explicit syntax, then I think that if we have to provide a built-in configuration for all possible "common permutations" to make the usage ergonomic is a bit of a non-starter.

You're right. This is a significant problem with my design.

There is also the question of when Cargo sees target..dependencies, how does it know whether string is an alias for more cfgs or whether it's an actual custom target spec? Today it just treats it entirely opaquely.

It's always an alias for a cfg:

let mut config = parse();
let targets = config.remote("target");
for (_, target) in targets {
    if eval_cfg(target["cfg"]) {
        config.merge(target);
    }
}

One off configs are a good point. I'm still not happy with the syntax but I'm also no longer very happy with mine so, no more objections.

However, it would be nice if we could get TOML to treat parentheses as a form of quotes (token tree quotes?). That is, allow target.cfg(not(target_os = "linux")).dependencies but not target.cfg()).dependencies.

# Detailed design
[design]: #detailed-design

The target-specific dependency syntax in Cargo will be expanded to to include
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: "to to"

@nrc
Copy link
Member

nrc commented Dec 20, 2015

lgtm. I'm in favour of the version proposed in the RFC - the symmetry with rustc is nice - the less Cargo is a special world and more like Rust, the better, IMO. Sounds like target aliases are nice to have too, but (IIUC) implementing the RFC as proposed and later adding a generic target alias mechanism would subsume the alternative proposed above.

I would really like to drop the "" if possible.

@tbu-
Copy link
Contributor

tbu- commented Dec 23, 2015

The Rust syntax doesn't really look good in TOML in my opinion.

@michaelwoerister
Copy link
Member

The version proposed in the RFC looks good to me .

@brson
Copy link
Contributor

brson commented Jan 13, 2016

I'm torn but also kind of ambivalent and want this to get done. The aesthetics of @Stebalien's is a lot better to me. The indirection it adds with the target 'aliases' is good and bad: it could avoid lots of cfg duplication, but requires more code to set up than the OP RFC. I'm not sure how I feel about predefining cfg target aliases. @Stebalien's looks to me like sugar over the original proposal, but adding both is pretty bloaty.

@alexcrichton
Copy link
Member Author

We discussed this in the tools triage meeting this past Wednesday, and the conclusion was that there wasn't enough consensus to move forward with the proposed syntax yet.

Syntax such as @Stebalien's proposal is easier to read, but unfortunately also has downsides. It was decided that it was probably too complicated to implement one with the intent that if it didn't work another would be implemented, so we wanted to narrow it down ahead of time (e.g. now).

One part of the discussion I believe we should assume is that TOML will not change. I suspect that it's pretty unlikely the spec will change much at this point, especially for features such as "this is ok so long as there are balanced delimeters" (although it would be quite nice...)

@jethrogb
Copy link
Contributor

One part of the discussion I believe we should assume is that TOML will not change.

Not even the " vs. ' internal quotes?

cc @BurntSushi

@alexcrichton
Copy link
Member Author

That's something we might be able to modify TOML with, but even that may be a bit of a stretch and I wouldn't want to hinge acceptance of this RFC on a modification like that.

@BurntSushi
Copy link
Member

@jethrogb Sorry, I hadn't seen that question. I responded.

@jethrogb
Copy link
Contributor

I read through this entire PR again, and aside from @Stebalien's proposal which he withdrew, I don't really see any alternative proposals. The remaining complaints are mostly "doesn't look good". With today's change in the TOML spec allowing us to get rid of the backslashes in all cases I think that means that the currently proposed RFC is the best and only proposal out there right now and should be our way forward.

@alexcrichton
Copy link
Member Author

@jethrogb ah yes thanks for the reminder! To clarify the syntax for strings-in-table-keys in TOML has been tweaked to allow, for example:

[target.'cfg(foo = "bar")'.dependencies]
# ...

I've updated the RFC to reflect this.

@alexcrichton
Copy link
Member Author

For those interested, I've implemented this in a branch, currently in this commit.

@alexcrichton
Copy link
Member Author

The tools team discussed this in triage this past week, and the decision was to merge. The recent change in TOML syntax should allow for much more ergonomic usage of string-like #[cfg] annotations, and it doesn't seems that any one syntax is clearly better than another, so we're going to move forward with what's proposed here.

Thanks regardless again for the discussion everyone!

@alexcrichton alexcrichton merged commit 60e6d04 into rust-lang:master Jan 29, 2016
@ruuda
Copy link

ruuda commented Jan 31, 2017

For future reference: the implementation of this RFC became available with Rust and Cargo version 1.8.0, released 2016-04-14.

@Centril Centril added A-target Target related proposals & ideas A-dependencies Proposals relating to dependencies. labels Nov 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-dependencies Proposals relating to dependencies. A-target Target related proposals & ideas final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. T-compiler Relevant to the compiler team, which will review and decide on the RFC. T-dev-tools Relevant to the development tools team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.