NPE while collecting worker metrics #18545
Labels
help wanted: Someone outside the Bazel team could own this
P3: We're not considering working on this, but happy to review a PR. (No assignee)
team-Local-Exec: Issues and PRs for the Execution (Local) team
type: bug
Description of the bug:
We occasionally see this error in CI:
It appears to be caused by a race between this method:
And the iteration over the keys in the workerIdToWorkerProperties map in collectMetrics, where it's possible for the workerLastCallTime map to not yet have an entry for a worker ID that is present in the workerIdToWorkerProperties map.
Since d31dd09, this should no longer be a problem on master, but we're running an older version of Bazel.
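To make the failure mode concrete, here is a minimal sketch of how such a gap between the two maps can surface as an NPE. Only the names workerIdToWorkerProperties, workerLastCallTime and collectMetrics come from this report; the value types, the placeholder property string, and the body of collectMetrics are assumptions for illustration, not the actual Bazel code. The race window is simulated deterministically by populating only one of the maps.

```java
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class WorkerMetricsRaceSketch {
  // Map names follow the issue text; key and value types are assumptions.
  static final Map<Integer, String> workerIdToWorkerProperties = new ConcurrentHashMap<>();
  static final Map<Integer, Instant> workerLastCallTime = new ConcurrentHashMap<>();

  // Hypothetical reader: iterates the keys of one map and looks each key up
  // in the other map without a null check.
  static long collectMetrics() {
    long newestCallMillis = 0;
    for (Integer workerId : workerIdToWorkerProperties.keySet()) {
      Instant lastCall = workerLastCallTime.get(workerId); // null if not populated yet
      newestCallMillis = Math.max(newestCallMillis, lastCall.toEpochMilli()); // NPE when null
    }
    return newestCallMillis;
  }

  // Simulates the race window: the properties entry exists, the last-call-time
  // entry does not. Returns true if collectMetrics() threw an NPE.
  static boolean reproduces() {
    workerIdToWorkerProperties.clear();
    workerLastCallTime.clear();
    workerIdToWorkerProperties.put(1, "placeholder-properties");
    try {
      collectMetrics();
      return false;
    } catch (NullPointerException e) {
      return true;
    } finally {
      // The "late" write that closes the window.
      workerLastCallTime.put(1, Instant.now());
    }
  }

  public static void main(String[] args) {
    System.out.println("NPE observed inside the window: " + reproduces());
    System.out.println("collectMetrics() after the window: " + collectMetrics());
  }
}
```

In a real run the window is only open between the two writes, which would be consistent with the error showing up only occasionally.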
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
No easy repro.
Which operating system are you running Bazel on?
linux, windows, macos
What is the output of bazel info release?
release 6.1.0-ec97d6a
If bazel info release returns development version or (@non-git), tell us how you built Bazel.
We apply some local patches, but they should not affect this behaviour.
What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD?
No response
Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.
d233c89 introduced the race.
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
This patch probably fixes the issue by ensuring that the workerLastCallTime map gets populated before the workerIdToWorkerProperties map:
We're going to include that in our local patches, and should be able to confirm whether that fixes the issue.
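The ordering idea described above can be sketched as follows. This is not the patch itself: registerWorker, countMissingLastCallTimes and stress are hypothetical helpers, and the use of ConcurrentHashMap with these value types is an assumption. The point is only that if the map being looked up is written before the map being iterated, a reader that sees a worker ID in workerIdToWorkerProperties should always find the matching workerLastCallTime entry.

```java
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class WorkerRegistrationOrderSketch {
  // Map names follow the issue text; key and value types are assumptions.
  static final Map<Integer, String> workerIdToWorkerProperties = new ConcurrentHashMap<>();
  static final Map<Integer, Instant> workerLastCallTime = new ConcurrentHashMap<>();

  // Hypothetical registration helper: write the looked-up map first,
  // then the map whose keys the reader iterates over.
  static void registerWorker(int workerId, String properties) {
    workerLastCallTime.put(workerId, Instant.now());
    workerIdToWorkerProperties.put(workerId, properties);
  }

  // Counts worker IDs that are visible to the reader but have no last call time.
  static int countMissingLastCallTimes() {
    int missing = 0;
    for (Integer workerId : workerIdToWorkerProperties.keySet()) {
      if (workerLastCallTime.get(workerId) == null) {
        missing++;
      }
    }
    return missing;
  }

  // Registers workers on one thread while the calling thread keeps reading;
  // returns the total number of missing entries observed.
  static int stress(int workers) {
    Thread writer = new Thread(() -> {
      for (int i = 0; i < workers; i++) {
        registerWorker(i, "worker-" + i);
      }
    });
    writer.start();
    int missing = 0;
    while (writer.isAlive()) {
      missing += countMissingLastCallTimes();
    }
    try {
      writer.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return missing + countMissingLastCallTimes();
  }

  public static void main(String[] args) {
    System.out.println("missing entries observed: " + stress(2000));
  }
}
```

With the two put calls swapped, the same reader could observe a worker ID that has no last call time yet, which is the window described in this report. A null check in the reader would be an alternative or additional safeguard.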