Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reset lockfile information between resolutions #8274

Merged
merged 6 commits into from
Jun 1, 2020

Conversation

Eh2406
Copy link
Contributor

@Eh2406 Eh2406 commented May 24, 2020

#8249 pointed out that some kind of lockfile data was leaking between calls to the resolver. @ehuss made a reproducing test case. This PR resets the LockedMap data structure when calling register_previous_locks.

lets see if CI likes it.
fix #8249

@rust-highfive
Copy link

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 24, 2020
Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Man would I love to remove this second call to the resolver...

This looks like it works in CI at least! Sorry I haven't been following the issue too too closely, but mind explaining a bit why this call is necessary? I'm also interested in how we haven't ever noticed this until now?

Vec<(PackageId, Vec<PackageId>)>,
>,
// The next level is keyed by the name of the package...
(SourceId, InternedString),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a small refactoring, right, not required for this change to be correct?

(ok either way, just wanted to confirm)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I did some refactorings to make debugging easier. This one felt worth keeping.

@ehuss
Copy link
Contributor

ehuss commented May 27, 2020

I think this makes sense. Can you tell me if my summary is correct?

In the test, it builds a Cargo.lock (generate-lockfile). The test then adds a new dependency, and Cargo then:

  1. Loads Cargo.lock
  2. Resolves workspace:
    2a. Locks things in PackageRegistry
    2b. Runs resolver (update due to new dependency)
    2c. Writes new Cargo.lock
  3. Build feature-specific resolver:
    3a. Locks things in PackageRegistry (again!)
    3b. Runs resolver with features selected on the command-line

The issue is that the locked entries in PackageRegistry from step 2a are no longer correct because step 2b updated the dependency graph. In step 3a, it is starting with stale entries. The solution is in step 3a to clear PackageRegistry first.

Does that sound correct? If so, I'm surprised this hasn't been a problem in the past. PackageRegistry is one of the things that I struggle to understand.

@Eh2406
Copy link
Contributor Author

Eh2406 commented May 27, 2020

mind explaining a bit why this call is necessary?

I can try, it is not clear which layers of abstraction the explanation is best in. So it may take some back and forth.

register_previous_locks takes a previous resolution or lockfile and builds a LockedMap by calling register_lock.

  • Note 1, that the package is added to the end of the list in register_lock.
  • Note 2, before this PR the LockedMap was append only.

The LockedMap is read in 2 places here and here.

  • Note 3. in both .iter().find( is used get the first thing that matches.

Ok now we can step thru the bug, at this level of abstraction.

  • The original lock file is read. The only thing in it is Foo has no dependencies.
  • The first run of the resolver looks at this LockedMap and figures out that it is outdated. (How? Different level of abstraction.) And builds the correct resolution, the same one as gened without a lockfile or with the full lockfile.
  • The second call to the resolver calls register_previous_locks with the result of the first run.
  • This adds a full lock to the end of the list for Foo specifying its dependencies. And it adds a full lock for all the transitive deps.
  • The second call to the resolver sees an inconsistent Foo has no dependencies and locked versions of the dependencies. This looks like a bug at this level of abstraction. It should just use the results from the first run not a mix. (But why did this mix make the resolver fail? Different level of abstraction)

I'm also interested in how we haven't ever noticed this until now?

IDK. One thing is that it only makes a problem for the first call to cargo after making a change to the dependencies. So it takes some clever debugging to notese it. I would have just bin like "it works now" and moved on with my life. Also the change to the dependencies need to make an inconsistent LockedMap that has a visibly different result. I wonder if we have had deps rebuilt on the second run and just not had it reported.

@Eh2406
Copy link
Contributor Author

Eh2406 commented May 28, 2020

One of the questions asked yesterday was to follow up on "But why did this mix make the resolver fail?". So I instrumented the registry_cache/summary_cache, witch caches the results of questions to the registry. I converted that into a test for the resolver and pushed a test. The test shows that the "mix lockfile" as provided to the resolver should not have a resolution.

@alexcrichton
Copy link
Member

@bors: r+

Ok I've thought about this enough that I'm ok with this. I unfortunately don't have the time right now to really dig deep into this. I can't really answer why this hasn't constantly come up in the past or how we haven't noticed this until now.

@bors
Copy link
Contributor

bors commented Jun 1, 2020

📌 Commit 8ce0d02 has been approved by alexcrichton

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 1, 2020
@bors
Copy link
Contributor

bors commented Jun 1, 2020

⌛ Testing commit 8ce0d02 with merge 5847787...

@bors
Copy link
Contributor

bors commented Jun 1, 2020

☀️ Test successful - checks-azure
Approved by: alexcrichton
Pushing 5847787 to master...

@bors bors merged commit 5847787 into rust-lang:master Jun 1, 2020
bors added a commit to rust-lang-ci/rust that referenced this pull request Jun 4, 2020
Update cargo

9 commits in 9fcb8c1d20c17f51054f7aa4e08ff28d381fe096..40ebd52206e25c7a576ee42c137cc06a745a167a
2020-05-25 16:25:36 +0000 to 2020-06-01 22:35:00 +0000
- Warn if using hash in git URL, Fixes rust-lang/cargo#8241 (rust-lang/cargo#8297)
- reset lockfile information between resolutions (rust-lang/cargo#8274)
- Disable strip_works test on macos. (rust-lang/cargo#8301)
- Fix typo in impl Display for Strip (rust-lang/cargo#8299)
- Add support for rustdoc root URL mappings. (rust-lang/cargo#8287)
- Fix tests with enoent error message on non-english systems. (rust-lang/cargo#8295)
- Fix fingerprinting for lld on Windows with dylib. (rust-lang/cargo#8290)
- Fix a typo (rust-lang/cargo#8289)
- Fix several issues with close_output test. (rust-lang/cargo#8286)
@Eh2406 Eh2406 deleted the 8249-repro branch June 5, 2020 17:08
@ehuss ehuss added this to the 1.46.0 milestone Feb 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

"cargo build" fails to select version for new dependency on first run
5 participants