
storage: MVCC benchmarks unreliable under Bazel #83599

Closed
erikgrinaker opened this issue Jun 29, 2022 · 12 comments · Fixed by #99624
Labels
A-build-system · C-bug (Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.) · T-dev-inf

Comments

@erikgrinaker
Contributor

erikgrinaker commented Jun 29, 2022

pkg/storage has a set of MVCC benchmarks in bench_pebble_test.go, e.g. BenchmarkMVCCGet_Pebble and BenchmarkMVCCScan_Pebble. These operate on a relatively large dataset, which is pre-created if it does not already exist, but otherwise reused if one is found. This is generated by setupMVCCData, under pkg/storage/mvcc_data_*/.

When running under Bazel, e.g.:

$ dev bench pkg/storage -f "BenchmarkMVCCGet"

I often get highly unreliable results that can vary greatly between runs, especially when switching between two commits (to compare before/after benchmarks). I suspect this is some combination of Bazel reusing a previously created dataset in a working directory that depends on the build commit hash, or some such. It may also be related to having cancelled dataset creation, such that one of the builds uses a partial dataset while the other uses a complete one. I haven't dug into this. I don't see this variability with make, which stores the dataset directly in the Git clone and thus reuses the same dataset on every run.

CC @rickystewart

Epic CRDB-17171
Jira issue: CRDB-17163

@erikgrinaker added the C-bug and T-dev-inf labels on Jun 29, 2022
@rickystewart
Collaborator

@erikgrinaker Quick question, it seems that the dataset creation you're talking about should fall into the "setup" category of things that we exclude from the benchmark results, so my intuition is that there's some b.ResetTimer() (or equivalent) missing somewhere after this setup is performed. Does that sound right to you, or do we have a reason to believe that it's not "just" that simple?

@erikgrinaker
Contributor Author

Yeah, the setup is excluded from the benchmark time, it calls ResetTimer().

In any case, I think this issue started at the wrong end -- the above is what I observed (and it's admittedly a poor writeup; I was in a hurry and just needed to scribble something down). I think what we want is for the dataset to be persisted in a common directory, not inside the Bazel build directory, such that it's properly reused by all benchmark runs regardless of the commit that's checked out. Ideally somewhere easily accessible and discoverable, since we'll need to nuke it when we make changes to the generated dataset. What would be the best way to do that with Bazel?

@msbutler assigned cucaroach and unassigned msbutler on Aug 11, 2022
cucaroach added a commit to cucaroach/cockroach that referenced this issue Aug 15, 2022
crdb_test enables metamorphic variables and various "extra" debug
checking code paths that can interfere with performance testing. This
is also consistent with how make bench behaved.

If for whatever reason crdb_test is desirable the new flag
'--crdb-test-on' can be used.

If you want crdb_test on but no metamorphic variables the bazel test_env
argument can be used like so:

--test_env=COCKROACH_INTERNAL_DISABLE_METAMORPHIC_TESTING=true

Informs: cockroachdb#83599

Release note: None
cucaroach added a commit to cucaroach/cockroach that referenced this issue Aug 16, 2022
crdb_test enables metamorphic variables and various "extra" debug
checking code paths that can interfere with performance testing. This
is also consistent with how make bench behaved.

If for whatever reason crdb_test is desirable just pass --crdb_test to
the underlying bazel invocation.

If you want crdb_test on but no metamorphic variables the bazel test_env
argument can be used like so:

--test_env=COCKROACH_INTERNAL_DISABLE_METAMORPHIC_TESTING=true

Informs: cockroachdb#83599

Release note: None

Release justification: non-production code change
craig bot pushed a commit that referenced this issue Aug 16, 2022
85810: build: enable required release justifications for stability period r=matthewtodd a=matthewtodd

Addresses RE-235.

To begin stability period for the 22.2 release.

Will merge Sunday night.

Release justification: non-production code change
Release note: None

86167: bazel: make dev bench disable crdb_test r=cucaroach a=cucaroach

crdb_test enables metamorphic variables and various "extra" debug
checking code paths that can interfere with performance testing. This
is also consistent with how make bench behaved.

If for whatever reason crdb_test is desirable the new flag
'--crdb-test-on' can be used.

If you want crdb_test on but no metamorphic variables the bazel test_env
argument can be used like so:

--test_env=COCKROACH_INTERNAL_DISABLE_METAMORPHIC_TESTING=true

Informs: #83599

Release note: None
Release justification: Non-production code


86183: server: propagate a testing knob to tenants r=ajwerner a=ajwerner

I don't really know if this is the source of problems. I see some
non-determinism in ID generation, and I see that the testing knobs
passed to the testserver don't make their way to the tenant. I'm
hopeful this helps, and sure that this won't hurt.

Release justification: testing only change

Release note: None

Co-authored-by: Matthew Todd <[email protected]>
Co-authored-by: Tommy Reilly <[email protected]>
Co-authored-by: Andrew Werner <[email protected]>
@cucaroach
Contributor

Possibly fixed by #86167

@jbowens
Collaborator

jbowens commented Sep 23, 2022

I don't think this is fixed by #86167, because the MVCC data is not made available across runs. This definitely contributes to variability, but beyond that it greatly increases benchmark time, because the setup is performed every time you run ./dev bench.

@jbowens reopened this on Sep 23, 2022
@cucaroach removed their assignment on Oct 4, 2022
@cucaroach
Contributor

@jbowens can you expand on what you think may be causing this, and how bazel differs from make in this area?

@jbowens
Collaborator

jbowens commented Oct 4, 2022

@erikgrinaker's comment above summarizes it pretty well:

I think what we want is for the dataset to be persisted in a common directory, not inside the Bazel build directory, such that it's properly reused by all benchmark runs regardless of the commit that's checked out. Ideally somewhere easily accessible and discoverable, since we'll need to nuke it when we make changes to the generated dataset. What would be the best way to do that with Bazel?

Many of the MVCC benchmarks require an initial database state, which is populated by setupMVCCData and written to the filesystem. If the required initial database state already exists, the benchmark does not recreate the database state. This scheme avoids needing to reconstruct the entire database state for every new benchmark invocation, and it ensures that when comparing commits, both commits run on identical databases.

With bazel, this initial database state is written to the sandbox's build directory instead of the checked out repository. When the benchmark is complete, the build directory is removed. Every run of dev bench needs to recreate the database state.

@jbowens
Collaborator

jbowens commented Dec 14, 2022

Is there some way to lift this test data out of the bazel sandbox, so that we can reuse it between dev bench invocations?

@RaduBerinde
Member

Proposal: I am pretty sure that even when run under bazel, tests have access to the filesystem (more specifically to the user home dir). We can create a test-fixtures library which creates or reuses fixtures inside os.UserCacheDir() (something like $HOME/.cache/crdb-test-fixtures/TestName/FixtureName). The library should make sure that we don't reuse partially created fixtures (e.g. if the test gets interrupted during fixture creation, or fixture creation itself hits an error).

@RaduBerinde
Member

One wrinkle is that inside the sandbox $HOME is not set. But we could make dev bench/test pass the .cache directory through a custom env var.

@ajwerner
Contributor

I feel like the bazel-friendly option here would be to pull the state generation out into some other binary, generate the state as part of the bazel build step, then prosper. If you're not running with bazel, the files will continue to exist between runs and the current lazy behavior is fine.

The code to generate that initial state isn't so magical -- it seems like it could trivially be moved to a subpackage or something like that.

@RaduBerinde
Member

There are a handful of other benchmarks with the same problem. I feel it would get tedious to have separate generation binaries for each one. And these fixtures can be quite big and slow to generate; we wouldn't want to generate them unless we're actually interested in running the relevant (sub)benchmark.

@RaduBerinde
Member

RaduBerinde commented Mar 26, 2023

I am pretty sure that even when run under bazel, tests have access to the filesystem (more specifically to the user home dir).

I think this is incorrect.

Edit: but can be made so with --sandbox_writable_path.

RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Mar 27, 2023
Certain storage-related tests/benchmarks generate fixtures and attempt
to reuse them across invocations. This is important because fixtures
can be large and slow to generate; but more importantly the generation
is non-deterministic and we want to use the exact same fixture when
comparing benchmark data.

Currently the tests achieve this by using a `.gitignore`d subdirectory
inside the source tree. This does not work with bazel (which executes
the test in a sandbox).

This commit addresses the issue by adding new test infrastructure for
reusable fixtures. We use the user's `.cache` directory instead of the
source tree (specifically $HOME/.cache/crdb-test-fixtures/...).
For bazel, we make sure the path is available (and writable) in the
sandbox and we pass the path to the test through an env var.

Fixes cockroachdb#83599.

Release note: None
Epic: none
RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Mar 29, 2023
craig bot pushed a commit that referenced this issue Mar 30, 2023
99174: sql: fix circular dependencies in views r=rharding6373 a=rharding6373

This change fixes node crashes that could happen due to stack overflow if views were created with circular dependencies.

Fixes: #98999
Epic: none
Co-authored-by: [email protected]

Release note (bug fix): If views are created with circular dependencies, CRDB returns the error "cyclic view dependency for relation" instead of crashing the node. This bug has been present since at least 21.1.

99624: testutils: add infrastructure for reusable test fixtures r=RaduBerinde a=RaduBerinde

Certain storage-related tests/benchmarks generate fixtures and attempt to reuse them across invocations. This is important because fixtures can be large and slow to generate; but more importantly the generation is non-deterministic and we want to use the exact same fixture when comparing benchmark data.

Currently the tests achieve this by using a `.gitignore`d subdirectory inside the source tree. This does not work with bazel (which executes the test in a sandbox).

This commit addresses the issue by adding new test infrastructure for reusable fixtures. We use the user's `.cache` directory instead of the source tree (specifically $HOME/.cache/crdb-test-fixtures/...). For bazel, we make sure the path is available (and writable) in the sandbox and we pass the path to the test through an env var.

Fixes #83599.

Release note: None
Epic: none

99699: spanconfig: integrate SpanConfigBounds with the Store and KVSubscriber r=ajwerner a=arulajmani

This patch integrates `SpanConfigBounds` with the `KVSubscriber` and
`spanconfig.Store`. The `spanconfig.Store` used by the `KVSubscriber`
now has a handle to the global tenant capability state. It uses this to
clamp any secondary tenant span configs that are not in conformance
before returning them.

By default, clamping of secondary tenant span configurations is turned
off. It can be enabled using the `spanconfig.bounds.enabled` cluster
setting. The setting is hidden.

Fixes: #99689
Informs #99911

Release note: None

99759: roachtest: set lower min range_max_bytes in multitenant_distsql test r=rharding6373 a=rharding6373

The multitenant_distsql roachtest relies on a smaller range size than the new default min of range_max_bytes (64MiB) due to performance and resource limitations. We set the COCKROACH_MIN_RANGE_MAX_BYTES to allow the test to use the smaller range.

Fixes: #99626
Epic: None

Release note: None

99813: spanconfig: do not fatal in NeedsSplit and ComputeSplitKey r=irfansharif a=arulajmani

See individual commits for details. 

Fixes #97336.

99839: schemachanger: Annotate all tables if ALTER TABLE IF EXISTS on non-existent table r=Xiang-Gu a=Xiang-Gu

Previously, if table `t` did not exist, `ALTER TABLE IF EXISTS t` would only mark `t` as non-existent. This is inadequate, because for a stmt like `ALTER TABLE IF EXISTS t ADD FOREIGN KEY REFERENCES t_other` we would not touch `t_other`, and the validation logic would later complain that `t_other` is neither fully resolved nor marked as non-existent. This commit fixes it by marking all tables in the ALTER TABLE stmt as non-existent if `t` is non-existent, so we can pass the validation.

Fixes issues discovered in #99185
Epic: None

99865: roachtest: fix query used to get job status in backup/mixed-version r=srosenberg a=renatolabs

At the moment, we only query job status in mixed-version state (so we should always use `system.jobs`). However, the code added in this commit should continue to work once we start developing 23.2, as we're checking that the cluster version is at least 23.1 before using `crdb_internal.system_jobs`.

Epic: none

Release note: None

99878: jobs: change job_info.info_key to string r=dt a=dt

Release note: none.
Epic: none.

Hopefully we get this one in now before it is released and harder to change later. I think if we go with bytes, we'll spend the next several years typing convert_to over and over, or forgetting to and then typing it, when debugging.

Co-authored-by: rharding6373 <[email protected]>
Co-authored-by: Radu Berinde <[email protected]>
Co-authored-by: Arul Ajmani <[email protected]>
Co-authored-by: Xiang Gu <[email protected]>
Co-authored-by: Renato Costa <[email protected]>
Co-authored-by: David Taylor <[email protected]>
@craig (bot) closed this as completed in #99624 on Mar 30, 2023