Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Forbid wildcard dependencies on crates.io #1241

Merged
merged 2 commits into from
Sep 24, 2015

Conversation

sfackler
Copy link
Member

@sfackler sfackler commented Aug 7, 2015

@seanmonstar
Copy link
Contributor

👍

I warn people not to do so with hyper, but it still happens, and I break
lots of people when releasing new breaking versions.

On Thu, Aug 6, 2015, 10:12 PM Steven Fackler [email protected]
wrote:


You can view, comment on, or merge this pull request online at:

#1241
Commit Summary

  • Forbid wildcard dependencies on crates.io

File Changes

Patch Links:


Reply to this email directly or view it on GitHub
#1241.

@lambda-fairy
Copy link
Contributor

An alternative is to continue allowing wildcard bounds, but let maintainers tighten the bounds post-hoc if any breakage comes up.

Another thing to consider is an --allow-newer option, which tells Cargo to ignore upper bounds during dependency resolution. This is useful when you know a crate will build against a newer dependency, but the maintainer hasn't updated it yet.

Both solutions are used by the Haskell community, which has grappled with dependency issues for quite some time.

(EDIT: I'm not for or against this proposal; just putting these ideas out there.)

@Ericson2314
Copy link
Contributor

Definitely +10000 for post-hoc changing of the constraints. I'd also love it for users to suggest changes to the constraints, and a flag for cargo to go by the user consensus instead of the official constraints (consider abandoned libraries)

This is still useful if we decide its better to start conservative and then relax constraints.

@sfackler
Copy link
Member Author

sfackler commented Aug 8, 2015

Changing dependency constraints for published versions of crates seems like it would wreak havoc on build reproducibility. We could perhaps add a fourth "level" that could only be incremented on the crates.io side, so version 3.2.4's dependencies could be corrected in version 3.2.4.1.

@Ericson2314
Copy link
Contributor

Well definitely log changes.

I suppose one depend on a specific subsets of that "fourth level" but I'd wait to see if reproducibility is a problem in practice. I guess my initial impression is that the change is supposed to be unobservable, but I haven't thought that through.

@sfackler
Copy link
Member Author

sfackler commented Aug 8, 2015

Hmm, I think I might have been thinking about that wrong and reproducibility wouldn't be an issue.

@nrc nrc added T-libs-api Relevant to the library API team, which will review and decide on the RFC. T-dev-tools Relevant to the development tools team, which will review and decide on the RFC. labels Aug 9, 2015
@Diggsey
Copy link
Contributor

Diggsey commented Aug 10, 2015

Crazy idea: disallow wildcard dependencies, but allow lower bounds. In that case, when a dependency is updated to a new major version, automatically build and test with the updated dependency. If the build, or any of the tests fail, automatically insert the correct upper bound.

Realistically, that's all the crate maintainer is going to do anyway, so may as well automate it!

Combined with the ability for maintainers to modify dependency bounds after publishing in response to bug reports, this would provide very precise and fine-grained control of dependency bounds, while minimizing the burden on maintainers of responding quickly and publishing new versions when one of their dependencies is updated.

@alexcrichton
Copy link
Member

I'm in favor of this RFC, the motivation is quite right in that not having an upper bound on a version constraint is basically guaranteed to break at some point in the future.


@lfairy

An alternative is to continue allowing wildcard bounds, but let maintainers tighten the bounds post-hoc if any breakage comes up.

I agree with @sfackler that this is not a great idea for reproducibility. A lock file indicates a solution to the dependency graph provided by the set of Cargo.toml files in play, but this only works because the constraints in place by the dependency graph are constant. If they were allowed to change over time (e.g. the bounds are changed) then previously generated lockfiles which represented a solution will no longer be a valid solution.

It could in theory be possible to track revisions and then you lock to a particular index file at a particular point in time, but that seems much more complicated than just rejecting * dependencies which, as @sfackler pointed out, are basically guaranteed to break in the future.


@Diggsey

Crazy idea: disallow wildcard dependencies, but allow lower bounds. In that case, when a dependency is updated to a new major version, automatically build and test with the updated dependency. If the build, or any of the tests fail, automatically insert the correct upper bound.

Unfortunately I don't think this can be done in a bulletproof fashion. Your dependencies can become part of your own public API if you just export or return one of the types, so an upgrade across the major version of a dependency may not break you but could break your own downstream clients. It generally seems like the best practice is no matter what still "have an upper bound".

If you really want to keep the current behavior then dependencies can look like:

foo = ">= 0.0, <= 10000"

(or something like that)

@bluss
Copy link
Member

bluss commented Aug 11, 2015

I wish to solve this too, the way "*" is used as default now is worrying.

I imagine a few crates actually want the convenience of "*", if they and dependencies are constantly changing. For example crates tracking nightly, syntax extensions could be in this position.

@Diggsey
Copy link
Contributor

Diggsey commented Aug 11, 2015

If you really want to keep the current behavior then dependencies can look like:

I was specifically interested in avoiding the time pressure: when an upstream crate upgrades to a new major version, it basically has a domino effect, where all downstream crates have to update as quickly as possible, as soon as all their dependencies are updated to use the new version. Invariably, there's one crate whose maintainer is on holiday or busy or whatever, which results in the entire ecosystem being held back.

On the other hand, allowing upper bounds to get ahead of the actual versions of crates seems to defeat the point of removing wildcard support: if you can continue to use "<= 1000" then for all intents and purposes, it's a wildcard bound.

While automated upgrades as I described would only be as good as the tests, there's nothing to say that their maintainers do more than just rerun their test suite with the new version anyway.

Another option would be to allow upper bounds to only go say one or two major versions ahead of the actual published crate. With a little help from cargo, that could be quite powerful (eg. an easy way for the maintainer to switch from using the latest published version of a crate, to one tagged 'beta' in the associated git repo, would allow them to also test one version ahead, and allow everyone to update their crates in parallel ready for when one of their upstream crates has a new major version)

@gsingh93
Copy link

While I'm mostly for this RFC, I'm concerned about the point @bluss brings up that there may be some legitimate uses for wildcard dependencies. Maybe there could be some opt-in way to allow wildcard dependencies, which would discourage the majority of people from enabling it unless it was really needed? Or maybe if this is something only unstable projects need, we could allow this for projects with a major revision number of zero (I don't know if this actually means unstable in semver, but it's how I use it)?

@sfackler
Copy link
Member Author

Super unstable projects are not really served well by crates.io's current setup - you end up with tons of versions that live in S3 forever wasting space.

Maven has SNAPSHOT versions in addition to normal release versions, which differ from normal release versions in that they're mutable. The dependency is simply overwritten every time it's updated.

@seanmonstar
Copy link
Contributor

Could "super unstable projects" just be depended on via a git dependency?

[dependencies.super_unstable]
git = "https://githuib.com/super/unstable"

@sfackler
Copy link
Member Author

Yeah, that is definitely the other alternative, and now that I think about it, seems like a better method than SNAPSHOT publishes given that Cargo has built in git support.

@codyps
Copy link

codyps commented Aug 12, 2015

I'm not really a fan of this. Most of the time, my code is fine with newer versions of other crates. Forbidding wildcards just means that I'll try to work around it (by trying to set very high max versions) or just have to spend more time needlessly bumping versions in my Cargo.toml files, pushing new versions to git, and publishing a new version on crates.io. We'll be trading an occasional need to fix things with an ongoing desire to bump version numbers of dependencies.

Without some type of automation to handle updating Cargo.toml files like we have for Cargo.lock files, I think this is definitely a bad idea. Even with that automation, I'm concerned this will result in more things than necessary being linked into projects (more than 1 version of a crate) even in cases where 1 version of a crate would work.

@sfackler
Copy link
Member Author

"Most of the time" isn't really good enough if your constraint choices end up breaking random downstream users.

@seanmonstar
Copy link
Contributor

@jmesmon very likely scenario: your crate foo depends on crate zed with a * version. zed releases version 2.0 with breaking changes the same time you're on vacation. All users of your crate suddenly find that they are downloading [email protected], and their project is no longer compiling.

Worse still, what if you no longer care about crate foo. You've moved on to better, green crates. So you don't pay attention to when it suddenly cannot build anymore. Users scramble trying to figure out how to do so.


Compare to you depending on zed = "1.0", which will accept ALL versions of 1.x. Users will automatically get minor and patch releases, such as 1.7.89. Assuming zed follows semver, everyone is happy.

@gsingh93
Copy link

I just realized one reason I use wildcards in a lot of my projects is I don't want to bother looking up the version number of the project. I often know the name of the crate, but don't know the version off the top of my head. I'd be much more willing to ditch wildcard dependencies if I could ask Cargo to add a dependency, and Cargo would add a reasonable default version (maybe only the major and minor version numbers, i.e. 1.0) or maybe the default could be configured. There are some package managers that have this functionality (i.e. bower install jquery --save), having this in Cargo might help here.

@Ericson2314
Copy link
Contributor

As convenient as it would be, I think it is hypocritical to allow "git dependencies" but not wildcards. Both offer the same impossible promise that some library will never make a breaking change to it's interface. If we need one, I suggest we allow both, but banish package (releases) that use them to some "unsafe sublisting" of packages.

@alexcrichton Revising the bounds is done because it is impossible to get the bounds both sound and complete---because upstream revisions that haven't been made yet. I am OK with shrinking the bound to make it sound being a patch version bump---since one can rely on bugs in the code, one can rely on bugs produced by an unsound bound that nevertheless results in a successful build. But extending the bound (in attempt to make it complete) is always a harmless change---even when the bound is extended unsoundly---since lock files ensure reproducible builds.

This reminds me, does cargo attempt to avoid earlier patch versions? Since the only reason to publish a new patch version is a performance increase or bug fix, it seems between foolish and dangerous to accept earlier patch versions---don't want to get downstream "accustomed" to an upstream bug, especially one that has been fixed. By contrast, a bump in the minor version may just indicate an extended interface, in which case there is there no harm whatsoever in using the older version.

@bluss
Copy link
Member

bluss commented Aug 12, 2015

@gsingh93 Cargo will still support "*", so you can use that during initial development. Then just write in the versions you used before publishing?

@sfackler
Copy link
Member Author

@Ericson2314 git dependencies are already forbidden on crates.io. Nothing in here proposes to change that.

I tentatively think that loosening constraints after publish would be a reasonable thing to do, but that seems out of scope for this RFC.


* `*`
* `> 0.3`
* `>= 0.3`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume 0.3.* will still be ok (it is in common use). What about 0.*?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, both of those would be fine (I didn't even realize they existed!)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0.* is a poor choice, but I'm not sure if it's worth explicitly prohibiting. I think we may want to wait and see.

@codyps
Copy link

codyps commented Aug 12, 2015

@seanmonstar @sfackler Isn't using/shipping a Cargo.lock file supposed to prevent some of the issues you're concerned about?

Right now downstream crates can avoid bumping the versions by controlling their Cargo.lock file a bit more directly.

Perhaps an alternate that doesn't increase the burden of maintaining crates would be to provide tools that allow more fine grained control of Cargo.lock?

@tbu-
Copy link
Contributor

tbu- commented Sep 9, 2015

In the current version the build can break if someone uses "*" you update your dependencies. This is clearly a bad thing. You can work around this by specifying a working version in your Cargo.lock.

In the proposed version, the upgrade of a library can be blocked by any crate in your dependency tree. This is also bad and cannot be worked around currently, so I don't think this RFC should be accepted until it addresses this, currently it just trades one issue for another, where the former has a work-around and the latter does not.

pick dependencies that have reasonable version restrictions, there could be a
wildcard constraint hiding five transitive levels down. Manually searching the
entire dependency graph is an exercise in frustration that shouldn't be
necessary.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bad version restrictions are also viral the other way around: There can always be a unneccessarily strict version constraint hiding "five transitive levels down". Manually searching the entire dependency graph is not a solution to this, as you'd have to publish your own versions of all the crates that recursively depend on this overly strict crate if the author does not update the version requirements by themselves – which means that the alternative this RFC is suggesting also carries this risk.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NB. this exact point is made above #1241 (comment) .

@BurntSushi
Copy link
Member

@tbu- I'm not sure how this RFC trades one issue for another. Even today, if a crate has an overly strict version bound, then you'll have the problem you described. This RFC does nothing to change that.

@tbu-
Copy link
Contributor

tbu- commented Sep 9, 2015

@BurntSushi Well, it pushes people away from the "*" problem which has a work-around to stricter version bounds which don't have a work-around currently. Once there is a way to get around a crate's overly restrictive bounds, this RFC will pose no problem.

@daboross
Copy link

@tbu- I don't really think Cargo.lock is really a workaround, due to the fact that this RFC would only affect libraries on crates.io, and it's currently idiomatic to not commit Cargo.lock to a version repository at all for library crates.

I mean, * will still work in binary crates and local crates regardless of whether this RFC happens or not, and - idiomatically - Cargo.lock is only used in those two scenarios (binary crates and local crates).

@tbu-
Copy link
Contributor

tbu- commented Sep 10, 2015

@daboross It's a work around in the way that that root node of the dependency tree can decide about all library's version, including library's dependencies. Cargo.lock used by binary crates also works for the library crates that the binary crate transitively depends on.

But as you say, all libraries would have to rely on the binary crate actually using them maintains a proper Cargo.lock, which isn't a nice thing either... I mean I really like this RFC, but we don't have override mechanisms for cases in which an author is not available. (As another example of this: Firefox regulary updates. Firefox addons usually include the next version as compatible as well, just because the author can't immediately release the next version when the next Firefox comes out. Firefox does come with an override tool (Compatiblity Reporter), so it's not that bad if you know about that tool.)

@Ericson2314
Copy link
Contributor

The current situation is not great, but I see @tbu-'s point in that sloppy versions are a way to work around the upgrade churn issue.

@aturon
Copy link
Member

aturon commented Sep 16, 2015

cc @wycats

Also update to a warning-first rollout
@sfackler
Copy link
Member Author

Updated to only block wildcard deps, and roll out with a warning only for one release cycle.

@alexcrichton
Copy link
Member

This libs team discussed this RFC today, and decided that we were in favor but in light of the recent change by @sfackler we're going to re-up this for another week-long FCP.

@tbu-
Copy link
Contributor

tbu- commented Sep 17, 2015

I appreciate the change, thank you!

@jimmycuadra
Copy link

I have no problems with this proposal now that it has been changed to target only wildcards.

downloads every day. 50% of the [libraries that depend on it](https://crates.io/crates/openssl/reverse_dependencies)
have a wildcard constraint on the version. None of them can build against every
version that has ever been released. Indeed, no libraries can since many of
those releases can before Rust 1.0 released. In addition, almost all of them

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think came is meant here instead of can twice

@wycats
Copy link
Contributor

wycats commented Sep 23, 2015

I've been thinking about this a lot, and I have a few thoughts.

I'm in favor of forbidding wildcard dependencies, as the current draft of the RFC proposes.

Version ranges make me uneasy, because they make it much more likely that a resolved dependency graph will still fail to compile, which is something I would like to eventually rule out altogether. I am not suggesting any changes to the current draft of this RFC related to version ranges, but I'll explore this topic further.

Shared Types

The core tension here is about shared types. To give a concrete example, let's say that format-time and enriched-time both depend on a time crate.

// format-time
extern crate time;
pub fn format_time<T: time::Time>(time: T) -> String {
  // format the time
}
// enriched-time
extern crate time;

pub struct EnrichedTime { /* ... */ }
impl time::Time for EnrichedTime { /* ... */ }

If a parent crate depends on both format-time and enriched-time, and tries to pass an instance of EnrichedTime to format_time, they better both be talking about the same time crate.

Cargo's Coarse-Grained Solution

Today's Cargo attempts to address this issue coarsely:

First, Cargo will automatically coalesce dependencies within the same major version into a single shared dependency across the entire graph. This means that if format-time depends on time 1.6 and enriched-time depends on time 1.7, they will still see the identical time::Time trait.

Second, Cargo assumes compliance with semantic versioning to make this work. This assumption is fundamental: if Cargo was not allowed to assume that time 1.7 was compatible with time 1.6, format-time and enriched-time (and any other crate with a shared dependency on time) would be forced to coordinate version bumping.

Third, depending on a range of versions (time 1.x to time 3.x) means that you are depending on a subset of the types in that range that are compatible across those versions, but without any enforcement.

If you are sharing those types in your public interface, or passing instances of those types to other dependencies, this creates a "matrix" problem: the only way to be confident that your crate will compile for all downstream crates is to build your crate will all combinations of upstream versions.

Shared vs. Internal Dependencies

One of the weaknesses of the current model is that it doesn't distinguish between dependencies used purely internally, as part of your implementation and dependencies exposed as part of your public interface.

Internal-only dependencies have a much higher-tolerance for using version ranges to select a subset of a crate's types that are compatible across major versions, because the code with the internal-only dependency knows exactly how the type is being used.

For example, even if a particular method on time::Time has changed between 1.x and 2.x, if your code doesn't use it, it is still safe to depend on the 1.x to 2.x range.

In contrast, dependencies exposed as part of your public API by definition do not know all of the ways those types are being used by downstream crates. As a result, it can only use version ranges if all of the types exposed in their public interface are compatible across those versions.

The Role of Small Crates and Semantic Versioning

Instead of trying to teach all crate authors the above rules, which are devilishly hard to get right, Cargo has so far relied on relatively small crates and semantic versioning to allow crate authors to describe compatibility for a small cluster of types.

Instead of asking format-time and enriched-time to coordinate a version bump of the time dependency across every "tiny" version bump, they are required to coordinate version bumps across major versions only.

The fact that people want sometimes to express their dependencies as ranges implies two things to me:

  1. There are times where a subset of all of the types in a crate are compatible across major versions, and a crate wants to depend on that subset.
  2. People are often using a crate as an internal-only dependency, where they only have to worry about finding a compatible subset of functions, not entire types.

A Proposal

I think we can address this problem with a small improvement:

Add a new matrix-build command to Cargo, which will build your crate against all combinations of (major versions of) upstream dependencies, so that you can figure out before publishing whether your version ranges are as compatible as you think they are.

Version Ranges Are Problematic

That said, I think that depending on version ranges for crates that you use as part of your public interface is a smell.

It assumes that there is a subset of a package's types and functions that are stable across major versions.

For example, if the time crate exposes both a relatively stable time::Time trait and more unstable utility methods, people may find themselves depending on the range of time versions that describe an implicit version of the time::Time trait.

If the time crates bumps its major version, all crates that depend on a range of time will need to check whether time::Time is still compatible, and bump the range. Of course, many crates in practice will forget to do this, meaning that perfectly compatible combinations of crates that share time::Time as a public dependency will no longer compile.

In this case, it behooves the author of the time crate to break out the more stable time::Time trait (time-traits and time-utils) and version it explicitly. This allows downstream dependencies that rely on time-traits to maintain compatibility with each other even when time-utils changes.

I see dependency ranges as an interim workaround, rather than a long-term solution.

Matrix Builds

We can reduce the risk of dependency hell caused by version ranges through cargo matrix-build. I'd like to make matrix-build a sanity check done by cargo publish (as cargo build is today), to ensure that combinations of upstream dependencies will compile downstream.

However, that can only work if dependency ranges are used as a last resort, and crate authors are encouraged to break out more stable interfaces from less stable implementation.

The simple advice amounts to:

If you're a crate that exports widely shared types or interfaces, you want to create a stable crate (in the same repository) that exposes just a set of stable traits that downstream dependencies can rely on.

This doesn't mean that everyone needs to preemptively create interface crates just in case. It just means that once a crate discovers that downstream crates are often depending on version ranges, the "right thing to do" is to break out the more stable interfaces from the less stable implementation. (if you're building a crate you intend to be widely shared, maybe you should do it preemptively.)

@alexcrichton
Copy link
Member

The library subteam discussed this RFC yesterday and the conclusion was to accept it, thanks again @sfackler!

@alexcrichton alexcrichton merged commit 10d7e82 into rust-lang:master Sep 24, 2015
huonw added a commit to huonw/clap-rs that referenced this pull request Oct 1, 2015
A `*` version constraint makes it hard/impossible to release a breaking
change (e.g. clap 2.0) without causing people's code to break.  `cargo
update` with a `*` dependency will happily move from (say) 1.4 to 2.0,
even though semver says these may be a breaking change. Using a more
precise dependency constraint gives the crate author more flexibility in
making breaking changes, and makes users of the crate happier if/when
they do happen.

rust-lang/rfcs#1241 contains a bit more
discussion about this.
huonw added a commit to huonw/clap-rs that referenced this pull request Oct 1, 2015
A `*` version constraint makes it hard/impossible to release a breaking
change (e.g. clap 2.0) without causing people's code to break.  `cargo
update` with a `*` dependency will happily move from (say) 1.4 to 2.0,
even though semver says these may be a breaking change. Using a more
precise dependency constraint gives the crate author more flexibility in
making breaking changes, and makes users of the crate happier if/when
they do happen.

rust-lang/rfcs#1241 contains a bit more
discussion about this.
homu added a commit to clap-rs/clap that referenced this pull request Oct 1, 2015
Avoid suggesting star dependencies.

A `*` version constraint makes it hard/impossible to release a breaking
change (e.g. clap 2.0) without causing people's code to break.  `cargo
update` with a `*` dependency will happily move from (say) 1.4 to 2.0,
even though semver says these may be a breaking change. Using a more
precise dependency constraint gives the crate author more flexibility in
making breaking changes, and makes users of the crate happier if/when
they do happen.

rust-lang/rfcs#1241 contains a bit more
discussion about this.
@Centril Centril added A-versioning Versioning related proposals & ideas A-dependencies Proposals relating to dependencies. labels Nov 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-dependencies Proposals relating to dependencies. A-versioning Versioning related proposals & ideas final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. T-dev-tools Relevant to the development tools team, which will review and decide on the RFC. T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.