
storage: MVCC benchmarks unreliable under Bazel #83599

Closed
erikgrinaker opened this issue Jun 29, 2022 · 12 comments · Fixed by #99624
Labels
A-build-system · C-bug (Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.) · T-dev-inf

Comments

@erikgrinaker
Contributor

erikgrinaker commented Jun 29, 2022

pkg/storage has a set of MVCC benchmarks in bench_pebble_test.go, e.g. BenchmarkMVCCGet_Pebble and BenchmarkMVCCScan_Pebble. These operate on a relatively large dataset, which is pre-created if it does not already exist, but otherwise reused if one is found. This is generated by setupMVCCData, under pkg/storage/mvcc_data_*/.

When running under Bazel, e.g.:

$ dev bench pkg/storage -f "BenchmarkMVCCGet"

I often get highly unreliable results that can vary greatly between runs, especially when switching between two commits (to compare before/after benchmarks). I suspect this is some combination of Bazel reusing a previously created dataset in a working directory that depends on the build commit hash, or some such. It may also be related to having cancelled dataset creation, such that one of the builds uses a partial dataset while the other uses a complete one. I haven't dug into this. I don't see this variability with make, which stores the dataset directly in the Git clone and thus reuses the same dataset on every run.

CC @rickystewart

Epic CRDB-17171
Jira issue: CRDB-17163

@erikgrinaker added the C-bug and T-dev-inf labels on Jun 29, 2022
@rickystewart
Collaborator

@erikgrinaker Quick question, it seems that the dataset creation you're talking about should fall into the "setup" category of things that we exclude from the benchmark results, so my intuition is that there's some b.ResetTimer() (or equivalent) missing somewhere after this setup is performed. Does that sound right to you, or do we have a reason to believe that it's not "just" that simple?

@erikgrinaker
Contributor Author

Yeah, the setup is excluded from the benchmark time, it calls ResetTimer().

In any case, I think this issue started at the wrong end -- the above is what I observed (and it's admittedly a poor writeup; I was in a hurry and just needed to scribble something down). I think what we want is for the dataset to be persisted in a common directory, not inside the Bazel build directory, such that it's properly reused by all benchmark runs regardless of the commit that's checked out. Ideally somewhere easily accessible and discoverable, since we'll need to nuke it when we make changes to the generated dataset. What would be the best way to do that with Bazel?

@msbutler assigned cucaroach and unassigned msbutler on Aug 11, 2022
cucaroach added a commit to cucaroach/cockroach that referenced this issue Aug 15, 2022
crdb_test enables metamorphic variables and various "extra" debug
checking code paths that can interfere with performance testing. This
is also consistent with how make bench behaved.

If for whatever reason crdb_test is desirable the new flag
'--crdb-test-on' can be used.

If you want crdb_test on but no metamorphic variables the bazel test_env
argument can be used like so:

--test_env=COCKROACH_INTERNAL_DISABLE_METAMORPHIC_TESTING=true

Informs: cockroachdb#83599

Release note: None
cucaroach added a commit to cucaroach/cockroach that referenced this issue Aug 16, 2022
crdb_test enables metamorphic variables and various "extra" debug
checking code paths that can interfere with performance testing. This
is also consistent with how make bench behaved.

If for whatever reason crdb_test is desirable just pass --crdb_test to
the underlying bazel invocation.

If you want crdb_test on but no metamorphic variables the bazel test_env
argument can be used like so:

--test_env=COCKROACH_INTERNAL_DISABLE_METAMORPHIC_TESTING=true

Informs: cockroachdb#83599

Release note: None

Release justification: non-production code change
craig bot pushed a commit that referenced this issue Aug 16, 2022
85810: build: enable required release justifications for stability period r=matthewtodd a=matthewtodd

Addresses RE-235.

To begin stability period for the 22.2 release.

Will merge Sunday night.

Release justification: non-production code change
Release note: None

86167: bazel: make dev bench disable crdb_test r=cucaroach a=cucaroach

crdb_test enables metamorphic variables and various "extra" debug
checking code paths that can interfere with performance testing. This
is also consistent with how make bench behaved.

If for whatever reason crdb_test is desirable the new flag
'--crdb-test-on' can be used.

If you want crdb_test on but no metamorphic variables the bazel test_env
argument can be used like so:

--test_env=COCKROACH_INTERNAL_DISABLE_METAMORPHIC_TESTING=true

Informs: #83599

Release note: None
Release justification: Non-production code


86183: server: propagate a testing knob to tenants r=ajwerner a=ajwerner

I don't really know if this is the source of problems. I see some
non-determinism in ID generation, and I see that the testing knobs
passed to the testserver don't make their way to the tenant. I'm
hopeful this helps, and sure that this won't hurt.

Release justification: testing only change

Release note: None

Co-authored-by: Matthew Todd <[email protected]>
Co-authored-by: Tommy Reilly <[email protected]>
Co-authored-by: Andrew Werner <[email protected]>
@cucaroach
Contributor

Possibly fixed by #86167

@jbowens
Collaborator

jbowens commented Sep 23, 2022

I don't think this is fixed by #86167, because the MVCC data is not made available across runs. This definitely contributes to variability, but beyond that it greatly increases benchmark time, because the setup is performed every time you run ./dev bench.

@jbowens reopened this on Sep 23, 2022
@cucaroach removed their assignment on Oct 4, 2022
@cucaroach
Contributor

@jbowens can you expand on what you think may be causing this, and how bazel differs from make in this area?

@jbowens
Collaborator

jbowens commented Oct 4, 2022

@erikgrinaker's comment above summarizes it pretty well:

I think what we want is for the dataset to be persisted in a common directory, not inside the Bazel build directory, such that it's properly reused by all benchmark runs regardless of the commit that's checked out. Ideally somewhere easily accessible and discoverable, since we'll need to nuke it when we make changes to the generated dataset. What would be the best way to do that with Bazel?

Many of the MVCC benchmarks require an initial database state, which is populated by setupMVCCData and written to the filesystem. If the required initial database state already exists, the benchmark does not recreate the database state. This scheme avoids needing to reconstruct the entire database state for every new benchmark invocation, and it ensures that when comparing commits, both commits run on identical databases.

With bazel, this initial database state is written to the sandbox's build directory instead of the checked out repository. When the benchmark is complete, the build directory is removed. Every run of dev bench needs to recreate the database state.

@jbowens
Collaborator

jbowens commented Dec 14, 2022

Is there some way to lift this test data out of the bazel sandbox, so that we can reuse it between dev bench invocations?

@RaduBerinde
Member

Proposal: I am pretty sure that even when run under bazel, tests have access to the filesystem (more specifically to the user home dir). We can create a test-fixtures library which creates or reuses fixtures inside os.UserCacheDir() (something like $HOME/.cache/crdb-test-fixtures/TestName/FixtureName). The library should make sure that we don't reuse partially created fixtures (e.g. if the test gets interrupted during fixture creation, or fixture creation itself hits an error).

@RaduBerinde
Member

One wrinkle is that inside the sandbox $HOME is not set. But we could make dev bench/test pass the .cache directory through a custom env var.

@ajwerner
Contributor

I feel like the bazel-friendly option here would be to pull the state generation out into some other binary, generate the state as part of the bazel build step, then prosper. If you're not running with bazel, the files will continue to exist between runs and the current lazy behavior is fine.

The code to generate that initial state isn't so magical -- it seems like it could trivially be moved to a subpackage or something like that.

@RaduBerinde
Member

There are a handful of other benchmarks with the same problem. I feel it would get tedious to have separate generation binaries for each one. And these fixtures can be quite big and slow to generate; we wouldn't want to generate them unless we're actually interested in running the relevant (sub)benchmark.

@RaduBerinde
Member

RaduBerinde commented Mar 26, 2023

I am pretty sure that even when run under bazel, tests have access to the filesystem (more specifically to the user home dir).

I think this is incorrect.

Edit: but can be made so with --sandbox_writable_path.

RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Mar 27, 2023
Certain storage-related tests/benchmarks generate fixtures and attempt
to reuse them across invocations. This is important because fixtures
can be large and slow to generate; but more importantly the generation
is non-deterministic and we want to use the exact same fixture when
comparing benchmark data.

Currently the tests achieve this by using a `.gitignore`d subdirectory
inside the source tree. This does not work with bazel (which executes
the test in a sandbox).

This commit addresses the issue by adding new test infrastructure for
reusable fixtures. We use the user's `.cache` directory instead of the
source tree (specifically $HOME/.cache/crdb-test-fixtures/...).
For bazel, we make sure the path is available (and writable) in the
sandbox and we pass the path to the test through an env var.

Fixes cockroachdb#83599.

Release note: None
Epic: none
RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Mar 29, 2023
craig bot pushed a commit that referenced this issue Mar 30, 2023
99174: sql: fix circular dependencies in views r=rharding6373 a=rharding6373

This change fixes node crashes that could happen due to stack overflow if views were created with circular dependencies.

Fixes: #98999
Epic: none
Co-authored-by: [email protected]

Release note (bug fix): If views are created with circular dependencies, CRDB returns the error "cyclic view dependency for relation" instead of crashing the node. This bug has been present since at least 21.1.

99624: testutils: add infrastructure for reusable test fixtures r=RaduBerinde a=RaduBerinde

Certain storage-related tests/benchmarks generate fixtures and attempt to reuse them across invocations. This is important because fixtures can be large and slow to generate; but more importantly the generation is non-deterministic and we want to use the exact same fixture when comparing benchmark data.

Currently the tests achieve this by using a `.gitignore`d subdirectory inside the source tree. This does not work with bazel (which executes the test in a sandbox).

This commit addresses the issue by adding new test infrastructure for reusable fixtures. We use the user's `.cache` directory instead of the source tree (specifically $HOME/.cache/crdb-test-fixtures/...). For bazel, we make sure the path is available (and writable) in the sandbox and we pass the path to the test through an env var.

Fixes #83599.

Release note: None
Epic: none

99699: spanconfig: integrate SpanConfigBounds with the Store and KVSubscriber r=ajwerner a=arulajmani

This patch integrates `SpanConfigBounds` with the `KVSubscriber` and
`spanconfig.Store`. The `spanconfig.Store` used by the `KVSubscriber`
now has a handle to the global tenant capability state. It uses this to
clamp any secondary tenant span configs that are not in conformance
before returning them.

By default, clamping of secondary tenant span configurations is turned
off. It can be enabled using the `spanconfig.bounds.enabled` cluster
setting. The setting is hidden.

Fixes: #99689
Informs #99911

Release note: None

99759: roachtest: set lower min range_max_bytes in multitenant_distsql test r=rharding6373 a=rharding6373

The multitenant_distsql roachtest relies on a smaller range size than the new default min of range_max_bytes (64MiB) due to performance and resource limitations. We set the COCKROACH_MIN_RANGE_MAX_BYTES to allow the test to use the smaller range.

Fixes: #99626
Epic: None

Release note: None

99813: spanconfig: do not fatal in NeedsSplit and ComputeSplitKey r=irfansharif a=arulajmani

See individual commits for details. 

Fixes #97336.

99839: schemachanger: Annotate all tables if ALTER TABLE IF EXISTS on non-existent table r=Xiang-Gu a=Xiang-Gu

Previously, if table `t` did not exist, `ALTER TABLE IF EXISTS t` would only mark `t` as non-existent. This is inadequate, because for a stmt like `ALTER TABLE IF EXISTS t ADD FOREIGN KEY REFERENCES t_other` we would not touch `t_other`, and the validation logic would later complain that `t_other` is neither fully resolved nor marked as non-existent. This commit fixes it by marking all tables in the ALTER TABLE stmt as non-existent if `t` is non-existent, so we can pass the validation.

Fixes issues discovered in #99185
Epic: None

99865: roachtest: fix query used to get job status in backup/mixed-version r=srosenberg a=renatolabs

At the moment, we only query job status in mixed-version state (so we should always use `system.jobs`). However, the code added in this commit should continue to work once we start developing 23.2, as we're checking that the cluster version is at least 23.1 before using `crdb_internal.system_jobs`.

Epic: none

Release note: None

99878: jobs: change job_info.info_key to string r=dt a=dt

Release note: none.
Epic: none.

Hopefully we get this one in now before it is released and harder to change later. I think if we go with bytes, we'll spend the next several years typing convert_to over and over, or forgetting to and then typing it, when debugging.

Co-authored-by: rharding6373 <[email protected]>
Co-authored-by: Radu Berinde <[email protected]>
Co-authored-by: Arul Ajmani <[email protected]>
Co-authored-by: Xiang Gu <[email protected]>
Co-authored-by: Renato Costa <[email protected]>
Co-authored-by: David Taylor <[email protected]>
@craig (bot) closed this as completed in #99624 on Mar 30, 2023