Deduplicate locally executed path mapped spawns #22556

Closed
wants to merge 14 commits into from

fmeum
Collaborator

@fmeum fmeum commented May 27, 2024

When path mapping is enabled, different `Spawn`s in the same build can have identical `RemoteAction.ActionKey`s and can thus provide remote cache hits for each other. However, cache hits are only possible after the first local execution has concluded and uploaded its result to the cache.
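To illustrate why path mapping can make two spawns share a key, here is a minimal sketch. It is not Bazel's implementation: the class `PathMappedKeySketch`, the regular expression, the `cfg` placeholder, and the SHA-256 "key" over the command line are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.regex.Pattern;

public class PathMappedKeySketch {
  // Hypothetical pattern for the configuration segment of an output path,
  // e.g. the "k8-fastbuild" in "bazel-out/k8-fastbuild/bin/...".
  private static final Pattern CONFIG_SEGMENT = Pattern.compile("bazel-out/[^/]+/");

  /** Replaces the configuration segment with a fixed placeholder (assumed to be "cfg"). */
  static String mapPath(String arg) {
    return CONFIG_SEGMENT.matcher(arg).replaceAll("bazel-out/cfg/");
  }

  /** Toy stand-in for an action key: a SHA-256 digest over the mapped arguments. */
  static String actionKey(List<String> args) {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      for (String arg : args) {
        digest.update(mapPath(arg).getBytes(StandardCharsets.UTF_8));
        digest.update((byte) 0); // separator so argument boundaries affect the digest
      }
      StringBuilder hex = new StringBuilder();
      for (byte b : digest.digest()) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // Two spawns that differ only in the configuration segment of an output path.
    List<String> spawnA = List.of("javac", "-d", "bazel-out/k8-fastbuild/bin/lib/classes");
    List<String> spawnB = List.of("javac", "-d", "bazel-out/k8-opt-exec/bin/lib/classes");
    System.out.println(actionKey(spawnA).equals(actionKey(spawnB))); // prints "true"
  }
}
```

Without the mapping step the two command lines would hash differently, so neither spawn could serve as a cache hit for the other.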

To avoid unnecessary duplication of local work, the first `Spawn` for each `RemoteAction.ActionKey` is tracked until its results have been uploaded. All other concurrently scheduled `Spawn`s with the same key wait for it and then copy over its local outputs.
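The tracking scheme described above can be sketched roughly as follows. This is a simplified illustration, not the code from this PR: the class `SpawnDeduplicationSketch`, its methods, and the use of a `String` for both the key and the outputs are hypothetical, and "upload" and "copy outputs" are reduced to returning a shared value.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class SpawnDeduplicationSketch {
  // Maps an action key to the in-flight execution that first claimed it.
  private final ConcurrentHashMap<String, CompletableFuture<String>> inFlight =
      new ConcurrentHashMap<>();
  private final ExecutorService executor;

  SpawnDeduplicationSketch(ExecutorService executor) {
    this.executor = executor;
  }

  /**
   * Starts executeAndUpload for the first caller of a key. Callers that arrive with the
   * same key while it is still running receive the same future instead of executing again.
   */
  CompletableFuture<String> execute(String actionKey, Supplier<String> executeAndUpload) {
    return inFlight.computeIfAbsent(
        actionKey,
        key ->
            CompletableFuture.supplyAsync(
                () -> {
                  try {
                    return executeAndUpload.get();
                  } finally {
                    // Stop tracking once execution and upload are done; in a real build,
                    // later spawns would be served by the remote cache instead.
                    inFlight.remove(key);
                  }
                },
                executor));
  }

  /** Schedules two spawns with the same key and returns the number of local executions. */
  static int runDemo() {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      SpawnDeduplicationSketch dedup = new SpawnDeduplicationSketch(pool);
      AtomicInteger localExecutions = new AtomicInteger();
      CountDownLatch release = new CountDownLatch(1);
      Supplier<String> work =
          () -> {
            localExecutions.incrementAndGet();
            try {
              release.await(); // keep the first execution in flight
            } catch (InterruptedException e) {
              Thread.currentThread().interrupt();
              throw new IllegalStateException(e);
            }
            return "outputs";
          };
      CompletableFuture<String> first = dedup.execute("key", work);
      CompletableFuture<String> second = dedup.execute("key", work); // reuses first
      release.countDown();
      first.join();
      second.join();
      return localExecutions.get();
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println(runDemo()); // prints "1"
  }
}
```

The entry is removed inside the worker task rather than in a completion callback so that the removal never runs on the thread that is still inside `computeIfAbsent`. A failed first execution propagates its exception to every waiter in this sketch; a real implementation would likely need a fallback for that case.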

Fixes #21043

@fmeum fmeum force-pushed the 21043-deduplicate-action branch from 27e14e8 to ef83b2a Compare May 28, 2024 20:05
@fmeum fmeum marked this pull request as ready for review May 28, 2024 20:05
@fmeum fmeum requested a review from a team as a code owner May 28, 2024 20:05
@fmeum fmeum requested a review from tjgq May 28, 2024 20:05
@github-actions github-actions bot added the team-Performance (Issues for Performance teams), team-Remote-Exec (Issues and PRs for the Execution (Remote) team), and awaiting-review (PR is awaiting review from an assigned reviewer) labels May 28, 2024
@fmeum
Collaborator Author

fmeum commented Jun 14, 2024

@tjgq Gentle ping

@fmeum fmeum requested a review from tjgq July 19, 2024 10:43
@tjgq
Contributor

tjgq commented Jul 22, 2024

I'll import this myself.

@fmeum
Collaborator Author

fmeum commented Jul 22, 2024

@bazel-io fork 7.3.0

@github-actions github-actions bot removed the awaiting-review (PR is awaiting review from an assigned reviewer) label Jul 23, 2024
fmeum added a commit to fmeum/bazel that referenced this pull request Jul 23, 2024
Fixes bazelbuild#21043

Closes bazelbuild#22556.

PiperOrigin-RevId: 655097996
Change-Id: I4368f9210c67a306775164d252aae122d8b46f9b
@fmeum fmeum deleted the 21043-deduplicate-action branch July 23, 2024 10:32
github-merge-queue bot pushed a commit that referenced this pull request Jul 29, 2024
Fixes #21043

Closes #22556.

PiperOrigin-RevId: 655097996
Change-Id: I4368f9210c67a306775164d252aae122d8b46f9b

Closes #23060
@fmeum fmeum mentioned this pull request Sep 19, 2024
github-merge-queue bot pushed a commit that referenced this pull request Sep 19, 2024
Cherry-picks the following changes to implement output reuse:
* Deduplicate locally executed path mapped spawns (#22556)
* Fix local execution deduplication to work with optional outputs
(#23296)
* Force synchronous upload and reuse of possibly modified spawn outputs
(#23382)
* Add support for in-memory outputs to output reuse (#23422)

Fixes #23377
Fixes #23444
Fixes #23457
Labels
team-Performance (Issues for Performance teams), team-Remote-Exec (Issues and PRs for the Execution (Remote) team)
Development

Successfully merging this pull request may close these issues.

--experimental_output_paths=strip is not effective when actions are scheduled in parallel
2 participants