x/build, x/build/cmd/coordinator: support for performance test execution #49207

Closed
prattmic opened this issue Oct 28, 2021 · 20 comments
Labels
Builders (x/build issues: builders, bots, dashboards); FrozenDueToAge; NeedsFix (the path to resolution is known, but the work has not been done); Performance

@prattmic
Member

Background

#48803 tracks the creation of a performance monitoring system for the Go toolchain. This issue covers the first bullet: adding support to the build coordinator for running performance tests.

The performance tests we plan to run fall into one of these categories:

  • Standard “testing” package benchmarks living in the main Go repo or x/benchmarks.
  • “bent” third-party build- and micro-benchmarks.
  • Additional large-scale third-party application benchmarks added to x/benchmarks.

Initial Work

Limitations

To start collecting data sooner rather than later, the initial version as described here will simplify the problem by applying the following limitations, which we intend to eventually remove:

  • Builds will be scheduled using the default coordinator priority (LIFO by commit time), rather than a more complex bisection scheme.
  • Benchmarks only run against the Go toolchain version under test, not the baseline.
  • No support for TryBot performance testing.
  • No special snapshotting for benchmark external dependencies.

MVP Design

Builds will initially run on VMs of consistent size and microarchitecture (buildlet named host-linux-amd64-perf). We will characterize the noise level using VMs and may later switch to sole-tenant VMs or a dedicated physical machine to reduce noise. The benchmarks have a few external dependencies, notably rsync and perflock. These will be pre-installed in the machine image.

The new build configuration named linux-amd64-perf runs the performance tests as x/benchmarks sub-repo tests. It is configured to run only x/benchmarks tests by setting RunBench = true.

The benchmarks in x/benchmarks are not all run via go test, so runSubrepoTests will have special support for running x/benchmarks benchmarks. The tooling in x/benchmarks is still in flux to improve usability, so rather than encoding minor details about the tools into the coordinator, it will simply execute a to-be-written tool go run golang.org/x/benchmarks/cmd/bench, which is responsible for the details of running all benchmarks.

cmd/bench outputs results to stdout in the Go Benchmark Data Format. The coordinator uploads these results to perfdata.golang.org, adding additional configuration keys like:

  • Go toolchain commit
  • x/benchmarks commit
  • Build time
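As a rough sketch of what adding configuration keys could look like: in the Go Benchmark Data Format, a configuration line has the form `key: value` and applies to the results that follow it. The key names and values below are hypothetical placeholders, not the keys the coordinator actually uses.

```go
package main

import (
	"fmt"
	"strings"
)

// configKey is one "key: value" configuration line in the Go benchmark
// data format. Keys start with a lowercase letter and contain no spaces.
type configKey struct {
	Key, Value string
}

// annotate prepends configuration lines to raw benchmark output so that
// every result that follows carries those keys.
func annotate(raw string, keys []configKey) string {
	var b strings.Builder
	for _, k := range keys {
		fmt.Fprintf(&b, "%s: %s\n", k.Key, k.Value)
	}
	b.WriteString(raw)
	return b.String()
}

func main() {
	raw := "BenchmarkExample 100 12345 ns/op\n"
	// Hypothetical key names and placeholder values.
	out := annotate(raw, []configKey{
		{"toolchain-commit", "0123abcd"},
		{"benchmarks-commit", "4567ef01"},
		{"build-time", "2021-11-05T00:00:00Z"},
	})
	fmt.Print(out)
}
```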

Old benchmark support

The coordinator has some support for running benchmarks out of x/benchmarks from 2017. This support was disabled in 2018 due to lack of support for some migrations in the coordinator. See CL 354315 for the full enumeration of this code.

This proposal will initially remove nearly all this support, as it is not relevant to the benchmarks we want to run today, and may be confusing to future readers. Some parts will be reused or repurposed, such as:

  • BuildConfig.RunBench to indicate performance test builders.
  • Client code to upload results to perfdata.golang.org.

Future Work

The design above is the minimum necessary to start running tests and collecting data, and is a starting point for future improvements we will want. Here I discuss the future changes we expect to make and the general expected design. We expect the priority and design of these to change as we learn from running the MVP.

Baseline testing

To minimize noise from environmental changes like OS updates, we would like to run all tests against both the toolchain under test and a “fixed” baseline toolchain, which only changes occasionally (monthly?).

The main change here is to adjust runAllSharded to build both the baseline and test toolchain.

The baseline toolchain version is exposed to the tests as GOROOT_BASELINE.

To take advantage of toolchain snapshotting, we likely want to extend buildStatus.build to support individually fetching the baseline and test toolchain from different snapshots.

We expect this to be the first extension from the MVP design.
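A minimal sketch of how a benchmark runner might pick up the baseline toolchain, assuming the test toolchain is found via `GOROOT` (an assumption) and the baseline via `GOROOT_BASELINE` as described above. The `toolchain` type and the `"test"`/`"baseline"` names are hypothetical.

```go
package main

import (
	"fmt"
	"os"
)

// toolchain pairs a human-readable name with the GOROOT to benchmark.
type toolchain struct {
	Name   string
	GOROOT string
}

// toolchains returns the test toolchain and, when GOROOT_BASELINE is set,
// the baseline toolchain as well. getenv is injected so the lookup can be
// exercised without touching the real environment.
func toolchains(getenv func(string) string) []toolchain {
	tcs := []toolchain{{Name: "test", GOROOT: getenv("GOROOT")}}
	if base := getenv("GOROOT_BASELINE"); base != "" {
		tcs = append(tcs, toolchain{Name: "baseline", GOROOT: base})
	}
	return tcs
}

func main() {
	for _, tc := range toolchains(os.Getenv) {
		fmt.Printf("would benchmark %s toolchain at %q\n", tc.Name, tc.GOROOT)
	}
}
```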

Scheduling priority

With many benchmarks and only a single buildlet, we expect that there may not be enough capacity to run every single commit. The coordinator’s current scheduling algorithm is LIFO by commit time.

During lulls (such as weekends), the system will backfill in LIFO order, which may leave large gaps of untested commits. Instead, we would like to adjust the algorithm to prefer testing commits that shrink the largest untested gap, i.e., effectively a binary search ordering. This order may even be an improvement to apply to all builds, not just performance tests.

Adding support for this will require plumbing more information about the completed builds into the scheduler.
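To make the "shrink the largest untested gap" idea concrete, here is a small sketch. It assumes the scheduler can see, for commits ordered by commit time, which ones already have results; `nextCommit` is a hypothetical helper, not existing coordinator code.

```go
package main

import "fmt"

// nextCommit returns the index of the untested commit that splits the
// largest contiguous run of untested commits in half, or -1 if every
// commit has been tested. tested[i] reports whether commit i (ordered by
// commit time) already has results. Ties go to the earliest run.
func nextCommit(tested []bool) int {
	bestStart, bestLen := -1, 0
	start := -1
	for i := 0; i <= len(tested); i++ {
		if i < len(tested) && !tested[i] {
			if start < 0 {
				start = i // a new untested run begins here
			}
			continue
		}
		// A tested commit or the end of the slice closes any open run.
		if start >= 0 {
			if n := i - start; n > bestLen {
				bestStart, bestLen = start, n
			}
			start = -1
		}
	}
	if bestStart < 0 {
		return -1
	}
	return bestStart + bestLen/2
}

func main() {
	// Simulate backfilling seven untested commits during a lull.
	tested := make([]bool, 7)
	for {
		i := nextCommit(tested)
		if i < 0 {
			break
		}
		fmt.Println("test commit", i)
		tested[i] = true
	}
}
```

Each pick halves the largest remaining gap, so coverage across the commit range becomes roughly uniform well before every commit has been tested.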

Benchmark dependency snapshotting / caching

“bent” has large external dependencies fetched over the internet: it fetches many third-party packages (via a simple go get). Future benchmarks may fetch pre-built binary assets.

If these operations prove to be very expensive parts of testing, we may want to explore snapshotting these dependencies to save time across builds. The coordinator’s built-in snapshotting mechanism may not provide any speed boost vs fetching over the internet. Instead, read-only and checksummed copies of the dependencies could be saved on the buildlet for use across multiple builds.

The most important aspect here is to ensure that a test of a bad version of the toolchain can’t corrupt the cache in a way that breaks future builds.
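One way to guard against a corrupted cache entry is to verify a checksum before using it. The sketch below assumes each cached dependency has a known SHA-256 digest recorded somewhere trustworthy; `verifyCached` is a hypothetical helper and the temporary file stands in for a real cached dependency.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
)

// verifyCached reports whether the file at path has the expected
// hex-encoded SHA-256 digest. A mismatch means the cached copy should be
// discarded and re-fetched rather than used.
func verifyCached(path, wantHex string) (bool, error) {
	f, err := os.Open(path)
	if err != nil {
		return false, err
	}
	defer f.Close()
	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return false, err
	}
	return hex.EncodeToString(h.Sum(nil)) == wantHex, nil
}

func main() {
	data := []byte("stand-in for a fetched dependency")
	f, err := os.CreateTemp("", "dep-*")
	if err != nil {
		fmt.Println("create temp:", err)
		return
	}
	defer os.Remove(f.Name())
	if _, err := f.Write(data); err != nil {
		fmt.Println("write:", err)
		return
	}
	f.Close()

	sum := sha256.Sum256(data)
	ok, err := verifyCached(f.Name(), hex.EncodeToString(sum[:]))
	fmt.Println("cache entry valid:", ok, "err:", err)
}
```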

cc @mknyszek @aclements @dr2chase @jeremyfaller @golang/release

@prattmic prattmic added the Performance and NeedsFix labels Oct 28, 2021
@prattmic prattmic added this to the Backlog milestone Oct 28, 2021
@gopherbot gopherbot added the Builders label Oct 28, 2021
@gopherbot
Contributor

Change https://golang.org/cl/354315 mentions this issue: cmd/coordinator,dashboard,internal/buildgo: remove benchmark support

@gopherbot
Contributor

Change https://golang.org/cl/359854 mentions this issue: cmd/bench: add benchmark wrapper

gopherbot pushed a commit to golang/build that referenced this issue Nov 4, 2021
Running benchmarks has been disabled since 2018. Remove all the code to
keep things more maintainable and understandable.

We will be adding new benchmarking support soon, and may reuse some of
this code, but don't want half-working code adding confusion.

For golang/go#49207.

Change-Id: I11d52b0315bed4d91651c162af11853895012868
Reviewed-on: https://go-review.googlesource.com/c/build/+/354315
Trust: Michael Pratt <[email protected]>
Run-TryBot: Michael Pratt <[email protected]>
TryBot-Result: Go Bot <[email protected]>
Reviewed-by: Dmitri Shuralyov <[email protected]>
Reviewed-by: Carlos Amedee <[email protected]>
Reviewed-by: Alexander Rakoczy <[email protected]>
gopherbot pushed a commit to golang/benchmarks that referenced this issue Nov 4, 2021
The coordinator is getting support for running the benchmarks in this
repository. Since the benchmarks and interface are in flux, encoding all
of the details of running Go tests, bent arguments, etc into the
coordinator will likely cause churn and frustrating migration issues.

Instead, add cmd/bench which serves as the simple entrypoint for the
coordinator. The coordinator runs cmd/bench with the GOROOT to test
(eventually multiple GOROOTs), and this binary takes care of the
remaining details.

Right now, we just do a basic go test golang.org/x/benchmarks/... and
simple invocation of bent. Note that bent does not pass without
https://golang.org/cl/354634.

For golang/go#49207

Change-Id: I5c9cf89540cab605c0a64e17af85311d37985c25
Reviewed-on: https://go-review.googlesource.com/c/benchmarks/+/359854
Trust: Michael Pratt <[email protected]>
Run-TryBot: Michael Pratt <[email protected]>
TryBot-Result: Go Bot <[email protected]>
Reviewed-by: Michael Knyszek <[email protected]>
@gopherbot
Contributor

Change https://golang.org/cl/361418 mentions this issue: cmd/coordinator: upload performance test results to perfdata.golang.org

@gopherbot
Contributor

Change https://golang.org/cl/361417 mentions this issue: cmd/coordinator: run performance tests from x/benchmarks

@gopherbot
Contributor

Change https://golang.org/cl/354311 mentions this issue: dashboard: add linux-amd64-perf host and builder

gopherbot pushed a commit to golang/build that referenced this issue Nov 5, 2021
Add initial support for running the performance tests from x/benchmarks.
Since there are a variety of different test suites (some `go test`,
bent, etc), x/benchmarks provides a basic wrapper command,
golang.org/x/benchmarks/cmd/bench, which knows the minute details. The
coordinator just needs to run that one command.

This build mode is limited to builds of x/benchmarks on builders with
RunBench set to true. Currently there are none; a future CL will add the
initial such linux-amd64 builder.

For golang/go#49207

Change-Id: Ie006ec4a3757a5c2fed0925a3f9eb91edeaa5224
Reviewed-on: https://go-review.googlesource.com/c/build/+/361417
Trust: Michael Pratt <[email protected]>
Run-TryBot: Michael Pratt <[email protected]>
TryBot-Result: Go Bot <[email protected]>
Reviewed-by: Alexander Rakoczy <[email protected]>
gopherbot pushed a commit to golang/build that referenced this issue Nov 5, 2021
The performance test output contains benchfmt-formatted benchmark
results. Upload the output wholesale to perfdata.golang.org for
long-term storage and analysis.

The results for now are a bit rough, as the output may also contain
unrelated lines that look like benchfmt. For example, "go:
downloading github.com/BurntSushi/toml v0.3.1" adds a "go" label with
the value "downloading ...". In the future, we will ideally filter these
a bit better (perhaps in x/benchmarks/cmd/bench).

For golang/go#49207

Change-Id: Ifd2512c93902a74f9040db0f9d0c600348fc1849
Reviewed-on: https://go-review.googlesource.com/c/build/+/361418
Reviewed-by: Alexander Rakoczy <[email protected]>
Trust: Michael Pratt <[email protected]>
Run-TryBot: Michael Pratt <[email protected]>
TryBot-Result: Go Bot <[email protected]>
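As an illustration of the filtering idea mentioned in the commit message above, a very simple approach would be to drop lines emitted by the go command before uploading. This is only a sketch under the assumption that such lines always start with `go: `; `dropGoToolLines` is a hypothetical helper, not what the coordinator or cmd/bench does.

```go
package main

import (
	"fmt"
	"strings"
)

// dropGoToolLines removes lines that begin with "go: " (for example
// "go: downloading ..."), which would otherwise be parsed as a benchfmt
// configuration line with the key "go". All other lines are preserved
// unchanged, including their line endings.
func dropGoToolLines(out string) string {
	var kept []string
	for _, line := range strings.SplitAfter(out, "\n") {
		if strings.HasPrefix(line, "go: ") {
			continue
		}
		kept = append(kept, line)
	}
	return strings.Join(kept, "")
}

func main() {
	out := "go: downloading github.com/BurntSushi/toml v0.3.1\n" +
		"goos: linux\n" +
		"BenchmarkExample 100 12345 ns/op\n"
	fmt.Print(dropGoToolLines(out))
}
```

A prefix match like this is deliberately narrow: legitimate configuration lines such as `goos: linux` are kept because their key is `goos`, not `go`.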
gopherbot pushed a commit to golang/build that referenced this issue Nov 5, 2021
Add a new builder to run the x/benchmarks performance tests on
linux-amd64.

For now, this runs on a GCE C2 instance type, as these instances have
well-defined, consistent CPUs and other server architecture components.

In basic noise testing, even standard VMs of this type appear to be
fairly low noise. As we gain experience with actual monitoring, we may
change this to a sole-tenant VM type or even a dedicated machine if
necessary.

For golang/go#49207

Change-Id: I17eaeeb5349af925249940bebd5b860a2579e6df
Reviewed-on: https://go-review.googlesource.com/c/build/+/354311
Trust: Michael Pratt <[email protected]>
Run-TryBot: Michael Pratt <[email protected]>
TryBot-Result: Go Bot <[email protected]>
Reviewed-by: Alexander Rakoczy <[email protected]>
@gopherbot
Contributor

Change https://golang.org/cl/361656 mentions this issue: internal/coordinator/pool: count C2 and N2 quotas separately

gopherbot pushed a commit to golang/build that referenced this issue Nov 5, 2021
We currently use E2, C2, and N2 instances on GCE. C2 and N2 instances
have their own quotas, which are accounted separately from the CPUS
quotas.

This could probably be cleaned up to keep track of all CPU quotas and
handle more instance types, but this should work for the time being.

See: https://cloud.google.com/compute/quotas#cpu_quota

For golang/go#49207

Change-Id: Ida1e8de3c857560637095d57e972bca7222284ed
Reviewed-on: https://go-review.googlesource.com/c/build/+/361656
Trust: Alexander Rakoczy <[email protected]>
Run-TryBot: Alexander Rakoczy <[email protected]>
TryBot-Result: Go Bot <[email protected]>
Reviewed-by: Heschi Kreinick <[email protected]>
@gopherbot
Contributor

Change https://golang.org/cl/361734 mentions this issue: cmd/bent: remove required dependencies

@gopherbot
Contributor

Change https://golang.org/cl/361754 mentions this issue: dashboard: SkipSnapshot for linux-amd64-perf

gopherbot pushed a commit to golang/build that referenced this issue Nov 5, 2021
Since this builder doesn't build the go repo, it will be waiting forever
for a snapshot. Instead, just build Go for each run.

For golang/go#49207

Change-Id: I34a73b507278db402c478b4f5956633996772aae
Reviewed-on: https://go-review.googlesource.com/c/build/+/361754
Trust: Alexander Rakoczy <[email protected]>
Run-TryBot: Alexander Rakoczy <[email protected]>
TryBot-Result: Go Bot <[email protected]>
Reviewed-by: Dmitri Shuralyov <[email protected]>
@gopherbot
Contributor

Change https://golang.org/cl/361874 mentions this issue: cmd/coordinator: add basic metadata to perfdata upload

gopherbot pushed a commit to golang/benchmarks that referenced this issue Nov 5, 2021
Make rsync optional with fallback to cp.

Remove use of /usr/bin/time and replace with measuring time directly
from Go.

For golang/go#49207

Change-Id: Ief5a7a90f9460ddec1d5a51c99d4a534e38a5d9c
Reviewed-on: https://go-review.googlesource.com/c/benchmarks/+/361734
Trust: Michael Pratt <[email protected]>
Run-TryBot: Michael Pratt <[email protected]>
TryBot-Result: Go Bot <[email protected]>
Reviewed-by: Cherry Mui <[email protected]>
Reviewed-by: David Chase <[email protected]>
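The two changes in the commit message above (rsync made optional, timing measured from Go) could look roughly like the following. This is a sketch, not the code from the CL: `copyCommand` and `timeIt` are hypothetical names, the paths are placeholders, and the rsync and cp flags shown are one plausible choice.

```go
package main

import (
	"fmt"
	"os/exec"
	"time"
)

// copyCommand returns the argv for recursively copying src to dst. It
// uses rsync when rsyncPath is non-empty and falls back to cp otherwise.
func copyCommand(rsyncPath, src, dst string) []string {
	if rsyncPath != "" {
		return []string{rsyncPath, "-a", src, dst}
	}
	return []string{"cp", "-R", src, dst}
}

// timeIt runs f and returns how long it took, replacing the need for an
// external tool such as /usr/bin/time.
func timeIt(f func() error) (time.Duration, error) {
	start := time.Now()
	err := f()
	return time.Since(start), err
}

func main() {
	rsyncPath, err := exec.LookPath("rsync")
	if err != nil {
		rsyncPath = "" // rsync unavailable; fall back to cp
	}
	fmt.Println("copy command:", copyCommand(rsyncPath, "src-dir", "dst-dir"))

	elapsed, err := timeIt(func() error {
		time.Sleep(10 * time.Millisecond) // stand-in for running a benchmark
		return nil
	})
	fmt.Println("elapsed:", elapsed, "err:", err)
}
```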
gopherbot pushed a commit to golang/build that referenced this issue Nov 8, 2021
These make it possible to tell what was run, as well as a convenience
field stating whether this was a post-submit build or a trybot run.

For golang/go#49207

Change-Id: Iba979bcfd5a3bbdc11e2df0b8de4094cc7212356
Reviewed-on: https://go-review.googlesource.com/c/build/+/361874
Trust: Michael Pratt <[email protected]>
Run-TryBot: Michael Pratt <[email protected]>
TryBot-Result: Go Bot <[email protected]>
Reviewed-by: Alexander Rakoczy <[email protected]>
@gopherbot
Contributor

Change https://golang.org/cl/362375 mentions this issue: cmd/bench: wait for load average to drop before starting

@gopherbot
Contributor

Change https://golang.org/cl/376096 mentions this issue: cmd/bench: benchmark baseline toolchain

@gopherbot
Contributor

Change https://golang.org/cl/376634 mentions this issue: cmd/coordinator: baseline toolchain for benchmarks

gopherbot pushed a commit to golang/benchmarks that referenced this issue Jan 10, 2022
If BENCH_BASELINE_GOROOT is set, additionally benchmark that toolchain.
The benchfmt label 'toolchain' differentiates the 'experiment' and
'baseline' toolchains.

For golang/go#49207.

Change-Id: I737fa56786dc482172942462c5776c4c2773c0c5
Reviewed-on: https://go-review.googlesource.com/c/benchmarks/+/376096
Trust: Michael Pratt <[email protected]>
Run-TryBot: Michael Pratt <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Reviewed-by: Michael Knyszek <[email protected]>
gopherbot pushed a commit to golang/build that referenced this issue Jan 10, 2022
When benchmarking, we want to benchmark both the toolchain under test
(i.e., buildStatus.Rev) as well as an older "baseline" toolchain, which
will be compared against.

For now, the baseline toolchain is the latest stable release. In the
future we may want to update more frequently, but this is a simple
starting point.

This CL determines the baseline toolchain commit for a given test and
installs it on the buildlet at BENCH_BASELINE_GOROOT.
golang.org/x/benchmarks/cmd/bench is responsible for utilizing the
baseline toolchain. CL 376096 is the corresponding change to cmd/bench.

Most of the baseline toolchain logic is limited to runBenchmarkTests().
In theory, it logically fits a bit better with the rest of the toolchain
logic in build() et al, but keeping it limited to runBenchmarkTests()
helps keep the common build() path from getting much more complex for a
minor edge-case feature.

For golang/go#49207.
For golang/go#48803.

Change-Id: Id63f8333cf9d1ff952850c3347e999b5e98f7294
Reviewed-on: https://go-review.googlesource.com/c/build/+/376634
Reviewed-by: Alex Rakoczy <[email protected]>
Trust: Michael Pratt <[email protected]>
Run-TryBot: Michael Pratt <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
gopherbot pushed a commit to golang/benchmarks that referenced this issue Jan 10, 2022
The coordinator runs tests on a freshly booted VM which may still be
running background boot tasks when bench starts.

For minimal noise, wait for the system load average to drop (indicating
background tasks have completed) before continuing with benchmarking.

For golang/go#49207.

Change-Id: I8df01592fea31d49eae54074213e202b21d5728a
Reviewed-on: https://go-review.googlesource.com/c/benchmarks/+/362375
Reviewed-by: Michael Knyszek <[email protected]>
Trust: Michael Pratt <[email protected]>
Run-TryBot: Michael Pratt <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
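A rough illustration of the load-average wait described in the commit message above, assuming a Linux host where `/proc/loadavg` is available. The threshold, attempt count, and polling interval are arbitrary example values, and `parseLoad1` and `waitForLoad` are hypothetical names rather than the functions in cmd/bench.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
	"time"
)

// parseLoad1 extracts the 1-minute load average from the contents of
// /proc/loadavg, e.g. "0.42 0.30 0.25 1/234 5678".
func parseLoad1(contents string) (float64, error) {
	fields := strings.Fields(contents)
	if len(fields) == 0 {
		return 0, fmt.Errorf("empty loadavg contents")
	}
	return strconv.ParseFloat(fields[0], 64)
}

// waitForLoad polls read until the 1-minute load average falls below
// threshold, giving up after the given number of attempts.
func waitForLoad(read func() (string, error), threshold float64, attempts int, interval time.Duration) error {
	for i := 0; i < attempts; i++ {
		s, err := read()
		if err != nil {
			return err
		}
		load, err := parseLoad1(s)
		if err != nil {
			return err
		}
		if load < threshold {
			return nil
		}
		time.Sleep(interval)
	}
	return fmt.Errorf("load average still >= %.2f after %d attempts", threshold, attempts)
}

func main() {
	read := func() (string, error) {
		b, err := os.ReadFile("/proc/loadavg") // Linux-specific
		return string(b), err
	}
	if err := waitForLoad(read, 0.5, 3, time.Second); err != nil {
		fmt.Println("not idle:", err)
		return
	}
	fmt.Println("system idle; starting benchmarks")
}
```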
@gopherbot
Contributor

Change https://golang.org/cl/378214 mentions this issue: cmd/bench: move toolchain selection closer to execution

@gopherbot
Contributor

Change https://golang.org/cl/378274 mentions this issue: cmd/bench: integrate the Sweet benchmarks

@gopherbot
Contributor

Change https://golang.org/cl/378336 mentions this issue: dashboard: extend perf builder timeout

@gopherbot
Contributor

Change https://golang.org/cl/378334 mentions this issue: dashboard,internal/coordinator/pool: VM delete timeout from host config

gopherbot pushed a commit to golang/build that referenced this issue Jan 13, 2022
Allow individual host configurations to override the VM delete timeout
if they are used for longer-than-normal builds.

For golang/go#49207.

Change-Id: I9c5c80e5ee7dac2375cff17c64871ae2211f6309
Reviewed-on: https://go-review.googlesource.com/c/build/+/378334
Trust: Michael Pratt <[email protected]>
Run-TryBot: Michael Pratt <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Reviewed-by: Dmitri Shuralyov <[email protected]>
Reviewed-by: Alex Rakoczy <[email protected]>
gopherbot pushed a commit to golang/benchmarks that referenced this issue Jan 18, 2022
Right now we "iterate" over toolchains (GOROOTs) at the outer-most part
of the tool, but bringing that in closer lets us do things like only
build the benchmarking tools once.

This change also introduces abstractions around the Go tool from the
Sweet tool to simplify and deduplicate some code. For instance, building
bent currently fails with the baseline GOROOT because the GOROOT
environment variable isn't set correctly, but the "gotest" benchmarks
do.

For golang/go#49207.

Change-Id: I6816e1112174f951d3bc22c2b1033b8e98dc0327
Reviewed-on: https://go-review.googlesource.com/c/benchmarks/+/378214
Reviewed-by: Michael Pratt <[email protected]>
Trust: Michael Knyszek <[email protected]>
Run-TryBot: Michael Knyszek <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Reviewed-by: David Chase <[email protected]>
gopherbot pushed a commit to golang/build that referenced this issue Jan 19, 2022
The benchmarks in the perf builder may take several hours to complete.
Extend the VM deletion timeout so that they stick around long enough to
complete benchmarking.

For golang/go#49207.

Change-Id: I3e9d2a1df657406ef0f80b9c0cb713df3b716ca8
Reviewed-on: https://go-review.googlesource.com/c/build/+/378336
Trust: Michael Pratt <[email protected]>
Run-TryBot: Michael Pratt <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Reviewed-by: Dmitri Shuralyov <[email protected]>
Reviewed-by: Alex Rakoczy <[email protected]>
Reviewed-by: Carlos Amedee <[email protected]>
@gopherbot
Contributor

Change https://golang.org/cl/382097 mentions this issue: cmd/bench: run 10 iterations for each bent benchmark

@gopherbot
Contributor

Change https://golang.org/cl/382894 mentions this issue: env: include make in linux/amd64 builder images

gopherbot pushed a commit to golang/build that referenced this issue Feb 3, 2022
Some benchmarks in x/benchmarks from external sources wrap the go tool
in make. Add make to the linux/amd64 builders where these benchmarks will
run.

For golang/go#49207.

Change-Id: I4ea16c0aa63d1b520c61d0a2b9dabffdd8bb7094
Reviewed-on: https://go-review.googlesource.com/c/build/+/382894
Trust: Michael Knyszek <[email protected]>
Run-TryBot: Michael Knyszek <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Reviewed-by: Carlos Amedee <[email protected]>
gopherbot pushed a commit to golang/benchmarks that referenced this issue Feb 4, 2022
For golang/go#49207.

Change-Id: Ib18c5f574e30333a7d9d80019e26d6a565f4db1e
Reviewed-on: https://go-review.googlesource.com/c/benchmarks/+/378274
Reviewed-by: Michael Pratt <[email protected]>
Trust: Michael Knyszek <[email protected]>
gopherbot pushed a commit to golang/benchmarks that referenced this issue Feb 9, 2022
For golang/go#49207.

Change-Id: I83fa87a603cf26ed61d324975166388db1801487
Reviewed-on: https://go-review.googlesource.com/c/benchmarks/+/382097
Reviewed-by: David Chase <[email protected]>
Reviewed-by: Michael Pratt <[email protected]>
Trust: Michael Knyszek <[email protected]>
Run-TryBot: Michael Knyszek <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
@gopherbot
Contributor

Change https://go.dev/cl/397655 mentions this issue: cmd/coordinator: record commit time in RFC3339

gopherbot pushed a commit to golang/build that referenced this issue Apr 4, 2022
This is consistent with the format used by perfdata.golang.org for
upload-time and x/perf/cmd/bench for runstamp.

For golang/go#49207.

Change-Id: I0c800629c23eb830803d3017806ca6c9c8907b87
Reviewed-on: https://go-review.googlesource.com/c/build/+/397655
Trust: Michael Pratt <[email protected]>
Run-TryBot: Michael Pratt <[email protected]>
Reviewed-by: Michael Knyszek <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
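For reference, formatting a commit timestamp as RFC 3339 in Go is a one-liner with the standard `time` package. The sketch below assumes the commit time is available as Unix seconds; `commitTimeValue` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"time"
)

// commitTimeValue formats a commit time, given as Unix seconds, as an
// RFC 3339 timestamp in UTC.
func commitTimeValue(unixSeconds int64) string {
	return time.Unix(unixSeconds, 0).UTC().Format(time.RFC3339)
}

func main() {
	fmt.Println(commitTimeValue(0)) // prints "1970-01-01T00:00:00Z"
}
```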
@dmitshur dmitshur changed the title from "x/build: coordinator support for performance test execution" to "x/build, x/build/cmd/coordinator: support for performance test execution" May 31, 2022
@prattmic
Member Author

There will likely be more follow-up work here (some of the "future work" items), but they aren't planned right now and the core work is done.

@dmitshur dmitshur modified the milestones: Backlog, Unreleased May 31, 2022
@golang golang locked and limited conversation to collaborators May 31, 2023